The development of a mammalian cell line that overexpresses Myc and the purification of significant quantities of c-Myc
from these cells are described. Three types of c-Myc-driven protein oligomerization (or complex) formations are described: (1)
homo-oligomer complexes (herein termed C1 complexes) formed by association of at least two peptides of c-Myc, (2)
hetero-oligomer complexes (herein termed C2 complexes) formed by heterodimerization of at least two peptides, at least one of
which is not the c-Myc peptide, and specifically hetero-oligomerization between c-Myc and a 26-29 kD factor, and (3)
c-Myc-dependent hetero-oligomeric complexes (herein termed the C2' complex) formed in the presence of c-Myc, although such
hetero-oligomeric proteins do not contain any peptides which are c-Myc. The invention is further directed to a reliable and
accurate method for objectively classifying compounds, including human pharmaceuticals, as inhibitors of c-Myc activity.
TITLE OF THE INVENTION

C-MYC DNA BINDING PARTNERS, MOTIFS, SCREENING
ASSAYS, AND USES THEREOF

Cross-Reference to Related Applications

This application is a continuation-in-part of U.S. Application No.
07/510,253, filed April 19, 1990.

Field of the Invention

This invention is directed to methods for the purification of
mammalian Myc protein, and methods for the identification of compounds
that inhibit c-Myc transcriptional activity.

BACKGROUND OF THE INVENTION

Myc is a nuclear oncogene whose aberrant expression is associated
with many different types of human cancers in many different tissues
(Cole, M.D., Ann. Rev. Genet. 20:361-384 (1986)). While the mechanism
of c-Myc oncoprotein action remains unknown, it clearly plays a role in the
control of cell growth and differentiation (Luscher and Eisenman, Genes &
Dev. 4:2025-2035 (1990); Penn et al., Sem. Cancer Biol. 1:69 (1990)).
One plausible mechanism of Myc action is as a regulator of transcription in
a pathway directly controlling proliferation and differentiation. This model
is consistent with several observations. First, Myc has long been known as
a nuclear protein with a general affinity for DNA (Abrams et al., Cell
29:427-439 (1982); Alitalo et al., Nature 306:274-277 (1983); Donner
et al., Nature 296:262-265 (1982); Persson and Leder, Science 225:718-721
(1984)), and recently a site has been identified which is specifically
bound by bacterially expressed variants of c-Myc (Blackwell et al., Science
250:1149-1151 (1990); Prendergast and Ziff, Science 251:186-189 (1991)).
Second, full length c-Myc has been shown to both activate and repress
genes in transient transfection assays (Kaddurah-Daouk et al., Genes &
Dev. 1:347-357 (1987); Yang et al., Mol. Cell. Biol. 11:2291-2295
(1991)), and will weakly stimulate transcription when fused to a
heterologous DNA-binding domain (Lech et al., Cell 52:179-184 (1988);
Kato et al., Mol. Cell. Biol. 10:5914-5920 (1990)). And finally, sequence
similarities described below place Myc in the company of known
transcription factors.

Myc contains two domains that suggest it oligomerizes, perhaps as a
dimer, and binds specifically to DNA: a leucine zipper domain and a
basic-helix-loop-helix (B-HLH) domain. The leucine zipper is an α-helical
structure found in sequence-specific DNA-binding proteins such as Fos and
Jun, where it mediates homo- or heterodimerization via a coiled-coil
interaction (Landschulz et al., Science 240:1759-1764 (1988); O'Shea
et al., Science 243:538-542 (1989); and reviewed in Busch and
Sassone-Corsi, TIG 6:36-40 (1990)). This dimerization is necessary for DNA
binding (Gentz et al., Science 243:1695-1699 (1989); Halazonetis et al.,
Cell 55:917-924 (1988); Kouzarides and Ziff, Nature 336:646-651 (1988);
Turner and Tjian, Science 243:1689-1694 (1989)). The HLH region also
appears to mediate oligomerization necessary for DNA binding in several
developmentally important proteins (Murre et al., Cell 58:537-544 (1989);
Murre et al., Cell 56:777-783 (1989)). HLH proteins form a large and
growing family and include the products of the achaete-scute and
daughterless genes responsible for neural development in Drosophila, the R
gene family which regulates pigment pattern in corn, MyoD and several
other proteins involved in muscle-specific differentiation in vertebrates, and
a centromere binding protein, CBF1, from yeast (Braun et al., EMBO J.
8:701-709 (1989); Cai and Davis, Cell 61:437-446 (1990); Caudy et al.,
Cell 55:1061-1067 (1988); Cronmiller et al., Genes & Dev. 2:1666-1676
(1988); Davis et al., Cell 51:987-1000 (1987); Edmondson and Olson,
Genes & Dev. 3:628-640 (1989); Ludwig and Wessler, Cell 62:849-851
(1990); Pinney et al., Cell 53:781-793 (1988); Rhodes and Konieczny,
Genes & Dev. 3:2050-2061 (1989); Villares and Cabrera, Cell 50:415-424
(1987); Wright et al., Cell 56:607-617 (1989)). While many proteins






WO 93/08701 



PCT/US92/08603 






contain either an HLH or leucine zipper motif, Myc is one of a smaller
number of proteins which contain both an HLH and a leucine zipper. Both
the leucine zipper containing proteins and the HLH proteins require a
stretch of basic amino acids adjacent to the dimerization motif to contact
DNA (reviewed in Busch and Sassone-Corsi, TIG 6:36-40 (1990); Jones,
N., Cell 61:9-11 (1990)). Interestingly, all B-HLH proteins appear to bind
to closely related DNA sequences known as E-boxes. These are sequence
motifs found in the immunoglobulin and other tissue-specific enhancers
having a core of NNCANNTGNN [SEQ ID No. 16], where different central
bases are preferred by different B-HLH proteins and the flanking bases can
affect binding affinity (Blackwell et al., Science 250:1149-1151 (1990);
Blackwell and Weintraub, Science 250:1104-1110 (1990)). The core of the
reported binding site for c-Myc, CACGTG, fits this pattern and has the
same core sequence as the upstream sequence element (USE) of the
Adenovirus major late promoter (Blackwell et al., Science 250:1149-1151
(1990); Prendergast and Ziff, Science 251:186-189 (1991)). A cellular
transcription factor (USF or MLTF) which binds to the USE has recently
been cloned and also contains a B-HLH domain adjacent to a leucine zipper
(Gregor et al., Genes & Dev. 4:1730-1740 (1990)).
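
The E-box pattern described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name `find_eboxes` and the test sequence are hypothetical and do not come from the specification. The sketch scans a DNA string for the NNCANNTGNN consensus and flags the sites whose central hexamer is the reported c-Myc core, CACGTG.

```python
import re

# E-box consensus from the text: NNCANNTGNN, where N is any base.
# A lookahead is used so that overlapping sites are also reported.
EBOX = re.compile(r"(?=([ACGT]{2}CA[ACGT]{2}TG[ACGT]{2}))")
MYC_CORE = "CACGTG"  # core of the reported c-Myc binding site

def find_eboxes(seq):
    """Return (start, site, has_myc_core) for every E-box match in seq."""
    seq = seq.upper()
    return [
        (m.start(), m.group(1), m.group(1)[2:8] == MYC_CORE)
        for m in EBOX.finditer(seq)
    ]

# Hypothetical test sequence, not taken from the specification.
example = "GGTCACGTGACCAGGCAGCTGGA"
for start, site, is_myc in find_eboxes(example):
    print(start, site, "CACGTG core" if is_myc else "other E-box")
```

The second site in the example carries a CAGCTG core, so it matches the general E-box consensus but not the c-Myc core.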
Many of these B-HLH or leucine zipper proteins have been found to
form not only homodimers but heterodimers with other proteins having like
dimerization motifs (reviewed in Busch and Sassone-Corsi, TIG 6:36-40
(1990); Jones, N., Cell 61:9-11 (1990)). Heterodimerization between
specific groups of B-HLH or leucine zipper proteins can alter their DNA
binding properties. While homodimers might bind weakly, heterodimers
with the appropriate partner can bind with increased affinity and in some
cases with a new specificity (Jones, N., Cell 61:9-11 (1990); Blackwell and
Weintraub, Science 250:1104-1110 (1990); Wright et al., Mol. Cell. Biol.
11:4104-4110 (1991)). Myc is capable of forming a homo-oligomer at high
concentrations in vitro (Dang et al., Nature 337:664-666 (1989); Kerkhoff
and Bister, Oncogene 6:93-102 (1991)), although it is not clear whether




that homo-oligomer actually forms in vivo (Dang et al., Mol. Cell. Biol.
11:954-962 (1991)). It seems likely that Myc directly interacts with other
cellular protein(s) to form hetero-oligomer(s), and indeed one such
"partner" protein, designated Max, has recently been identified (Blackwood
and Eisenman, Science 251:1211-1217 (1991)). The effect that such
partner proteins have on Myc DNA-binding specificity is likely to be
central to understanding the function of Myc.

Much of the in vitro work done on B-HLH proteins has utilized in
vitro transcribed and translated protein or has used protein overexpressed in
bacteria. Myc expressed by these means has been used to determine
binding specificity and to demonstrate that Myc can form heterodimers with
Max (Blackwell et al., Science 250:1149-1151 (1990); Prendergast and
Ziff, Science 251:186-189 (1991); Blackwood and Eisenman, Science
251:1211-1217 (1991)). Myc, however, is post-translationally modified by
at least phosphorylation in mammalian cells (Hann and Eisenman, Mol.
Cell. Biol. 4:2486-2497 (1984); Ramsay et al., Proc. Natl. Acad. Sci. USA
81:7742-7746 (1984)), and post-translational modifications are believed to
regulate the function of many proteins, including the transcription factors
Myb, Fos, HSF, CREB, and SP-1 (Abate et al., Science 249:1157-1161
(1990); Jackson et al., Cell 63:155-165 (1990); Luscher et al., Nature
344:517-522 (1990); Sorger et al., Nature 329:81-84 (1987); Yamamoto
et al., Nature 334:494-498 (1988)). In addition, Myc produced in avian
cells has been reported to bind more tightly to DNA cellulose than
bacterially produced Myc (Kerkhoff and Bister, Oncogene 6:93-102 (1991)).

Several lines of evidence argue that the biochemical function(s) of
Myc will be determined in large part by hetero-oligomerization with Max
and perhaps with other, as yet unidentified, factors. A complete
understanding of the function of c-Myc will therefore require the
identification of all partner proteins and a functional characterization of the
complexes that these proteins form in the absence or presence of c-Myc.
To unravel the nature of Myc's function it will be necessary to determine
not only the binding properties of all relevant complexes but to ascertain
how they differ in action once bound. Post-translational modification might
play a role in modulating the formation, binding, or further activities of
these complexes, and the availability of large quantities of modified c-Myc,
such as described here, should facilitate a biochemical approach to this
problem. Such studies should lead us to an understanding of the complexes
available at different times in different cell types and the consequences for
each cell in terms of appropriate growth and differentiation, or
oncogenesis.

Further, to date, no inhibitors of c-Myc action have been identified. 
The identification of such inhibitors has suffered for lack of identification 
of a specific DNA binding sequence to which c-Myc binds, and for lack of 
a simple, inexpensive and reliable screening assay which could rapidly 
identify potential inhibitors and active derivatives thereof. Thus a need also
still exists for rapid, economical screening assays which identify specific 
inhibitors of c-Myc activity. 

SUMMARY OF THE INVENTION 

Recognizing the potential importance of inhibitors of c-Myc 
oncoprotein activity in the therapeutic treatment of many forms of cancer, 
and cognizant of the lack of a simple assay system in which such inhibitors 
might be identified, the inventors have investigated c-myc DNA binding. 

These efforts led to the development of a mammalian cell line that 
overexpresses Myc and the purification of significant quantities of c-Myc 
from these cells. These efforts culminated in the discovery of three types
of c-Myc-driven protein oligomerization (or complex) formations: (1)
homo-oligomer complexes (herein termed C1 complexes) formed by
association of at least two peptides of c-Myc, (2) hetero-oligomer
complexes (herein termed C2 complexes) formed by heterodimerization of
at least two peptides, at least one of which is not the c-Myc peptide, and
specifically hetero-oligomerization between c-Myc and a 26-29 kD factor,
and (3) c-Myc-dependent hetero-oligomeric complexes (herein termed the
C2' complex) formed in the presence of c-Myc, although such
hetero-oligomeric proteins do not contain any peptides which are c-Myc.

Accordingly, the invention is directed to a reliable and accurate 
method for the purification of Myc from a mammalian source. 

The invention is further directed to the use of oligomers containing
the DNA motif 5'-CACGTG-3', in its double stranded DNA form, as a
reliable and accurate method for the detection of the presence of C1
complexes in a sample.

The invention is further directed to the use of the DNA motif
5'-CACGTG-3', in its double stranded DNA form, as a reliable and accurate
method for the detection of C2 complexes in a sample.

The invention is further directed to the use of the DNA motif
5'-CAGCTG-3', in its double stranded DNA form, as a reliable and accurate
method for the detection of C2' complexes in a sample.

The invention is further directed to a 26-29 kD protein fraction
purified from Chinese hamster ovary (CHO) cells or baculovirus, such
protein fraction containing at least one peptide capable of forming C2
complex oligomers with c-Myc.

The invention is further directed to a 40-50 kD protein fraction
purified from CHO cells, such protein fraction containing at least one
peptide capable of forming C2' complex oligomers in the presence of
c-Myc.

The invention is further directed to a reliable and accurate method
for objectively classifying compounds, including human pharmaceuticals, as
inhibitors of c-Myc activity, and especially as an inhibitor of C1 complex
formation, C2 complex formation or C2' complex formation.

The invention is further directed to a reliable and accurate method
for objectively classifying compounds, including human pharmaceuticals, as
inhibitors of c-Myc activity, and especially as an inhibitor of C1 complex
DNA binding, C2 complex DNA binding, or C2' complex DNA binding.

The invention further provides a method for identifying and 
classifying the mechanism of action of a bioactive c-Myc-inhibiting 
compound. 

The invention further provides an assay for the monitoring of the
isolation and/or purification of a peptide capable of forming a C2 or C2'
complex, or a mixture of such peptides from a crude preparation.

The invention further provides an assay for the monitoring of the
isolation and/or purification of a c-Myc-inhibiting compound or mixture of
such compounds from a crude preparation of such compounds.

BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1. Purified c-Myc Protein. A) 1 µg of c-Myc protein purified
from the 5A overexpressing CHO cell line was subjected to 2-dimensional
gel electrophoresis. An isoelectric focusing tube gel was run with pH 5-7
ampholytes followed by SDS-PAGE and silver staining. The Myc proteins
are bracketed and arrows distinguish the 60, 62, and 72 kD species. The
gel was trimmed for this figure; the actual pI range for the Myc proteins
was 5.0-5.6. B) 0.5 µg of purified c-Myc protein from the indicated cell
lines was electrophoresed on an SDS gel and either visualized by silver
staining (left lane) or electroblotted to nitrocellulose and subjected to
immunoblotting using the ST-2 polyclonal antibody (right 2 lanes).

Fig. 2. DNA Binding of Purified c-Myc Proteins. The EMSA was
carried out as described in materials and methods using equal amounts
(approximately 2 ng) of the following probes and 0.5 µg of either purified
CHO produced c-Myc or baculovirus produced c-Myc: (µE2)3 lanes 1 and
7, (µE3)3 lanes 2 and 8, MLC-A lanes 3 and 9, MLC-B lanes 4 and 10,
(USE)3 lanes 5, 11, and 12, and HSE lanes 6 and 13. Full probe
sequences are given in materials and methods. Lanes 1-6 and lanes 7-13
are different exposures of lanes from the same gel.

Fig. 3. DNA Binding Activity is Present in Myc-Containing Slices of
SDS Gels. 400 µg of CHO produced c-Myc or 163 µg of baculovirus
produced c-Myc was separated on an SDS-PAGE gel. Proteins from 0.5
cm slices were recovered, renatured as described in materials and methods,
and analyzed by EMSA using the (USE)3 probe. 0.4 µg of the CHO Myc
load and 5 µl of the protein from the CHO Myc-containing slice were
analyzed on the same gel (left panel). 0.37 µg of the baculovirus Myc load
and 5 µl of the protein from the baculovirus Myc slice were analyzed on
the same gel (right panel). Slices from other molecular weight ranges of
the same gel showed no binding (data not shown).

Fig. 4. Activity is Formed by c-Myc and a 26-29 kD Factor.
Proteins from gel slices were recovered and analyzed by EMSA as
described in materials and methods using the (USE)3 probe. Lanes 1-4
represent proteins from the same gel loaded with baculovirus produced
Myc described for Fig. 5. These lanes contain 0.37 µg of the loaded
material (lane 1), 0.75 µg BSA with 7.5 µl of proteins from either a Myc
slice (lane 2) or a 26-29 kD slice (lane 3), or 7.5 µl of each slice used for
lanes 1 and 2 plus 0.2 µg of BSA (lane 4). Lanes 5-8 and 10 contain
proteins from gels loaded with Myc purified from CHO cells. These lanes
contain 0.47 µg of the gel load (lane 5), 4 µl of material from a Myc slice of a
gel loaded with 400 µg of Myc (lane 6), 7 µl of material from a 26-29 kD
slice of a similar gel plus 0.8 µg Protein A (lane 7), and both 4 µl of the
Myc slice and 7 µl of the 26-29 kD slice (lane 8). Lanes 9-12 utilize the
bacterially expressed Protein A-Myc fusion proteins containing either the
Myc B-HLH and leucine zipper domains (amino acids 353-439) or lacking
the basic region and containing Myc amino acids 372-439. These were
expressed and purified as described in materials and methods. Lane 9
contains 0.5 µg of Protein A-Myc(353-439) and lane 10 contains the same
plus 7 µl of the 26-29 kD slice. Lane 11 contains 1 µg of Protein
A-Myc(372-439) and lane 12 contains 0.5 µg of Protein A-Myc(372-439) plus
7 µl of the 26-29 kD slice.

Fig. 5. C2' Binding Activity Requires a 40-50 kD Factor. A) 101
µg of CHO produced c-Myc was separated on an SDS gel. Proteins were
recovered, resuspended in 100 µl, and renatured and analyzed by EMSA
using the ERP3/4 probe. This probe contains the portion of the MLC
enhancer that encompasses the µE2 site. EMSA samples contained 0.3 µg
of the SDS gel load (lane 1), 7.5 µl of the proteins from the Myc slice
(lane 2), or the 40-50 kD slice (lane 3), or 7.5 µl of both slices renatured
together (lane 4). B) EMSA samples contained 6.9 µg purified baculovirus
produced c-Myc (lane 5), 3 µl of protein from the 40-50 kD slice of a gel
loaded with 400 µg CHO produced c-Myc (lane 6), or both renatured
together (lane 7). The probe was ERP1/2. C) EMSA samples contained
10 µl (0.9 µg) of bacterially produced c-Myc fusion protein containing Myc
amino acids 353-439 (lane 8), 0.47 µg of CHO produced c-Myc (lane 9), 5
µl of protein from the 40-50 kD slice of a gel loaded with 400 µg of the
CHO Myc shown in lane 9 (lane 10), or 5 µl of the same 40-50 kD
material renatured in the presence of either 0.9 µg of the baculovirus
produced Myc shown in lane 5 (lane 11), 2 µl (0.18 µg) of the bacterially
produced Myc fusion protein containing Myc amino acids 353-439 (lane
12), or 4 µl (0.36 µg) of the same bacterially produced Myc fusion protein
(lane 13). The probe was ERP1/2.

Fig. 6. Antibodies to c-Myc Interact with the C1 and C2
Complexes. EMSA reactions were set up with the indicated Myc protein
preparations (0.37 µg baculovirus produced c-Myc or 0.47 µg of CHO
produced c-Myc). These reactions were preincubated 30 min on ice in the
presence of the indicated antibody (α-Myc monoclonal 1F7 or a
monoclonal directed against the lambda repressor, cI). 1 ng of SMS probe
or µE2-containing probe number 7 (see Fig. 7) was added subsequently and
binding and electrophoresis were as usual.
Fig. 7. Oligonucleotides Selected from Random Sequence after 8
Rounds of EMSA. Sequences were selected from oligonucleotides
containing 20 base pairs of random sequence using a reiterative EMSA
procedure described in materials and methods. Underlined nucleotides are
from the PCR primer sites. Tables below the aligned sequences tabulate
the frequency of each base in the 6 flanking positions surrounding the
CACGTG motifs. Only bases next to a perfect fit of the CACGTG core
were tabulated since sequences without this core were found not to function
as high affinity binding sites (Fig. 8, and data not shown). Bold numbers
adjacent to individual sequences indicate those oligonucleotides which were
tested individually by EMSA in Fig. 8. Asterisks indicate additional
sequences which were tested individually (data not shown).

Fig. 8. ^rf^t«f Sites f"T" BBdjgjBd Complexes. EMSA was
carried out using either 2.8 ng of the SMS probe or equal amounts (1 ng)
of probes 1-11 indicated in Fig. 7. Probes 1-11 were labeled and gel
isolated in parallel and had approximately equal specific activities. Binding
reactions contained either no additional protein (-), 0.37 µg of baculovirus
produced c-Myc (B) or 0.47 µg of CHO produced c-Myc (C). Free probe
is visible at the bottom of the gel.

Fig. 9. Qf£B3teofJhe H » nA ™ Complexes. The standard
EMSA reaction was scaled up for 11 samples containing 0.4 µg of purified
baculovirus produced c-Myc per sample. Probe and competitor were
(USE)3. After allowing 20 min for binding, 20 µl was loaded on a prerun
EMSA gel as a measure of the starting amount of complex (ST) and
enough cold competitor was added to the remaining sample to achieve a
250-fold molar excess over probe. Immediately upon addition of
competitor the sample was gently mixed and 20 µl aliquots were loaded at
the indicated times (0, 30 s, 1 min, 4 min, etc.). A control sample (C)
was made up individually in which competitor was added prior to the start
of binding to demonstrate complete competition. This sample was loaded
at the same time as the ST sample. All samples were loaded on a
continuously running gel so that the complex in the starting lane runs ahead
of the equivalent complex in lanes loaded later.
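
The flanking-base tabulation described for Fig. 7 can be sketched in Python as follows. This is a minimal illustration, not the inventors' procedure: the function name `flank_counts`, the choice of three positions on each side of the core (six in total), and the example sequences are all assumptions made for the sketch. Only sequences containing a perfect CACGTG core contribute to the counts, as stated in the figure description.

```python
from collections import Counter

CORE = "CACGTG"
FLANK = 3  # assumed: three bases on each side, six flanking positions in total

def flank_counts(seqs, core=CORE, flank=FLANK):
    """Count bases at each flanking position around perfect core matches.

    Returns a list of 2 * flank Counters, ordered 5' to 3'.
    Cores lacking a full set of flanking bases are skipped.
    """
    counts = [Counter() for _ in range(2 * flank)]
    for seq in seqs:
        seq = seq.upper()
        pos = seq.find(core)
        while pos != -1:
            end = pos + len(core)
            if pos >= flank and end + flank <= len(seq):
                flanking = seq[pos - flank:pos] + seq[end:end + flank]
                for i, base in enumerate(flanking):
                    counts[i][base] += 1
            pos = seq.find(core, pos + 1)
    return counts

# Hypothetical selected sequences, for illustration only.
selected = ["AAACACGTGTTT", "GGACACGTGTCC", "TTTCAGCTGAAA"]
for position, counter in enumerate(flank_counts(selected)):
    print(position, dict(counter))
```

The third example sequence has a CAGCTG core and is therefore ignored, which mirrors the rule that only perfect CACGTG fits were tabulated.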



OF THE 

In the description that follows, a number of terms used in
recombinant DNA technology are extensively utilized. In order to provide
a clearer and consistent understanding of the specification and claims,
including the scope to be given such terms, the following definitions are
provided.

Oligomer of Interest. As used herein, an "oligomer of interest"
refers to any of the following types of oligomeric proteins: first,
Myc-containing oligomers including homo-oligomers of Myc peptides (a C1
complex), and hetero-oligomers containing at least one peptide of Myc and
one peptide of a Myc "partner" (a C2 complex); second, oligomers that
form in the presence of Myc-containing homo-oligomers or Myc-containing
hetero-oligomers but which themselves do not contain the Myc peptide,
such oligomers including non-Myc-containing homo-oligomers that
associate in the presence of Myc and non-Myc-containing hetero-oligomers
that associate in the presence of Myc (a C2' complex).

Oligomer. An "oligomer" as it refers to proteins, means a protein
composed of more than one peptide subunit, such as dimers, trimers,
tetramers, etc. Such oligomeric protein may be a homo-oligomer, that is,
composed entirely of two or more identical subunits; alternatively, such
oligomeric protein may be a hetero-oligomer, that is, composed of at least
two different peptides. Oligomers containing three or more peptides may
contain more than one copy of a peptide.

C2' Protein(s). As used herein, for convenience, a "C2' protein" is
a protein or peptide that is a member of the second class of the
"oligomers-of-interest," that is, a protein that forms oligomers in the presence of Myc,
c-Myc homo-oligomers or Myc-containing hetero-oligomers so as to bind to




a specific DNA sequence, but which does not contain a Myc peptide, such
oligomers including non-Myc-containing homo-oligomers that associate in
the presence of Myc and non-Myc-containing hetero-oligomers that
associate in the presence of Myc.

Operably-linked. As used herein, two macromolecular elements are
operably-linked when the two macromolecular elements are physically
arranged such that factors which influence the activity of the first element
cause the first element to induce an effect on the second element. For
example, the transcription of a coding sequence which is operably-linked to
a promoter element is induced by factors which "activate" the promoter's
activity; transcription of a coding sequence which is operably-linked to a
promoter element is inhibited by factors which "repress" the promoter's
activity. Thus, a promoter region would be operably-linked to the coding
sequence of a protein if transcription of the coding sequence was
influenced by the activity of the promoter.

Response. As used herein, the term "response" is intended to refer
to a change in any parameter which can be used to measure, indicate or
otherwise describe c-Myc action or oligomer (homo-oligomer (C1 complex)
or hetero-oligomer (C2 complex)) formation, including c-Myc dependent
hetero-oligomerization (C2' complex) formation. The response may be
revealed as a physical change (such as a change in phenotype) or it may be
revealed as a molecular change (such as a change in a reaction rate or
affinity constant). Detection of the response may be performed by any
means appropriate. "Detecting" refers to any method by which such
response may be evaluated, so as to provide a meaningful indication of whether
the event has occurred.

Compound. The term "compound" is intended to refer to a
chemical entity, whether in the solid, liquid, or gaseous phase. The term
should be read to include synthetic compounds, natural products and
macromolecular entities such as polypeptides, polynucleotides, or lipids,
and also small entities such as neurotransmitters, ligands, hormones or
elemental compounds.

Bioactive Compound. The term "bioactive compound" is intended
to refer to any compound which induces a detectable or measurable
response in the methods of the invention.

Promoter. A "promoter" is a DNA sequence located proximal
to the start of transcription at the 5' end of the transcribed sequence. The
promoter may contain multiple regulatory elements which interact in
modulating transcription of the operably-linked gene.

Expression. Expression is the process by which the information
encoded within a gene is revealed. If the gene encodes a protein,
expression involves transcription of the DNA into mRNA, the processing
of mRNA (if necessary) into a mature mRNA product, and translation of
the mature mRNA into protein.

A nucleic acid molecule, such as a DNA or gene, is said to be
"capable of expressing" a polypeptide if the DNA contains the coding
sequences for the polypeptide and expression control sequences which, in
the appropriate host environment, provide the ability to transcribe, process
and translate the genetic information contained in the DNA into a protein
product, and if such expression control sequences are operably-linked to the
nucleotide sequence which encodes the polypeptide.

Cloning vehicle. A "cloning vehicle" is any molecular entity that is
capable of delivering a nucleic acid sequence into a host cell for cloning
purposes. Examples of cloning vehicles include plasmids or phage
genomes. A plasmid that can replicate autonomously in the host cell is
especially desired. Alternatively, a nucleic acid molecule that can insert
into the host cell's chromosomal DNA is especially useful.

Cloning vehicles are often characterized by one or a small number
of endonuclease recognition sites at which such DNA sequences may be cut
in a determinable fashion without loss of an essential biological function of
the vehicle, and into which DNA may be spliced in order to bring about its
replication and cloning.

The cloning vehicle may further contain a marker suitable for use in
the identification of cells transformed with the cloning vehicle. Markers,
for example, are tetracycline resistance or ampicillin resistance. The word
"vector" is sometimes used for "cloning vehicle."

Expression vehicle. An "expression vehicle" is a vehicle or vector
similar to a cloning vehicle but is especially designed to provide sequences
capable of expressing the cloned gene after transformation into a host.

In an expression vehicle, the gene to be cloned is usually
operably-linked to certain control sequences such as promoter sequences.
Expression control sequences will vary depending on whether the vector is
designed to express the operably-linked gene in a prokaryotic or eukaryotic
host and may additionally contain transcriptional elements such as enhancer
elements, termination sequences, tissue-specificity elements, and/or
translational initiation and termination sites.

Host. By "host" is meant any organism that is the recipient of a
cloning or expression vehicle.







rM"fr* n tf Activity 



Although there have been previous reports of purified Myc protein,
the present inventors found that the Myc protein preparations described
therein, and the methods used to isolate that protein, failed to achieve the
requisite amount of yield needed to sequence characterize Myc action in
mammalian sources. The inventors have overcome this problem and
describe, for the first time, a unique and useful method for the isolation of
highly purified mammalian c-Myc protein which provides the requisite
quantity of mammalian c-Myc protein needed for the
characterization of c-Myc directed DNA binding and biological action. The
inventors have also been able to purify large quantities of Myc from a
recombinant insect cell system. The purified Myc protein of the invention
exhibits the only known biochemical activity of c-Myc, an ability to bind
the sequence CACGTG. As a direct result of the method of the invention
for the isolation of c-Myc protein, the inventors were able to identify
peptides that naturally associate with c-Myc in hetero-oligomers, or
peptides that naturally associate with each other as a result of the action of
c-Myc, such peptides found to be present in certain column
chromatography fractions of the c-Myc purification scheme.

According to the invention, purification of Myc from a
mammalian source is preferably achieved utilizing a mammalian cell line
that overexpresses either recombinant or non-recombinant c-Myc and is
performed completely on ice or equivalent temperatures of 0-5°C, using
reagents and buffers at the same temperature. For example, the
overexpressing Chinese hamster ovary (CHO) cell line 5A is useful for
such purification. In CHO 5A cells, recombinant mouse c-Myc is under
the control of a regulatable promoter, and has been integrated and
amplified in the genome of the parent CHO cell line for maximum stability
and production. Except where otherwise noted, for the methods and assays
of the invention, the native or recombinant Myc should include at least the
two coding exons of Myc.

After collecting the cells by centrifugation using techniques known
in the art, and prior to lysis of the outer cell membrane, the cells should be
washed at least once in a low salt neutral buffer such as 0.9% NaCl in
10-50 mM phosphate, pH 7.0-7.3 (phosphate buffered saline, PBS) to remove
remaining growth medium.

Lysis of the washed cells is also achieved in a low salt, neutral to 
mildly acidic lysis buffer, preferably about pH 6.8, containing at least one
protease inhibitor, such as aprotinin or phenylmethylsulfonyl fluoride
(PMSF), preferably containing a combination of such inhibitors. Salts such
as potassium (in the KCl form) and magnesium (in the MgCl2 form) are
also preferably added. In addition, nonionic detergents such as NP40 (0.5%
v/v) and Na-deoxycholate (0.1%) should be added.

Cell outer membrane lysis should be performed under conditions
that lyse the host cell without lysing the nucleus or inducing significant
leakage from the nuclear membrane. The cells may be allowed to sit for a
short period of time, for example, 10 minutes, in the detergent-containing
lysis buffer before mechanical intervention is utilized in the lysis step.
Mechanical intervention is best performed with a gentle disruption of the
detergent treated cells, for example, utilizing 40 strokes in a Dounce
homogenizer with a type A pestle, or the equivalent of such treatment.

Nuclei may be collected from the lysed cell preparation using
techniques known in the art, such as, for example, centrifugation at 1000xg
for 5 min at 4°C, and washed at least once in the same low salt lysis buffer
used to lyse the outer cell membrane.

Nuclei are then resuspended in the low salt lysis buffer that
additionally contains sufficient DNAse I and incubated for a time sufficient
to efficaciously degrade the DNA in such nuclei to a size and viscosity that
allows subsequent purification of the c-Myc from this preparation as
described below.

Following the DNAse I treatment, the sample is diluted with a high
salt neutral buffer that brings the salt (as NaCl) concentration of the sample
to at least 2 M. Such high salt buffer preferably also contains MgCl2 in an
amount sufficient to maintain the same concentration of this salt in
the final diluted preparation, and additional detergent (NP40) so as to
retain efficacious levels after sample dilution.
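
The dilution arithmetic behind the high salt step can be sketched as follows. This is a minimal illustration, not part of the described method: the function names are hypothetical, and it assumes that the nuclei suspension contributes no NaCl of its own and that volumes are simply additive. The 4 M figure is the 2X High Salt Buffer concentration given in Example I.

```python
def mixed_concentration(c1, v1, c2, v2):
    """Concentration after combining volume v1 at c1 with volume v2 at c2."""
    return (c1 * v1 + c2 * v2) / (v1 + v2)


def required_stock(target, c1, v1, v2):
    """Stock concentration needed so that adding v2 of it to v1 at c1 gives target."""
    return (target * (v1 + v2) - c1 * v1) / v2


# Equal volumes: nuclei suspension (assumed 0 M NaCl) plus 2X buffer at 4 M NaCl.
final_nacl = mixed_concentration(0.0, 1.0, 4.0, 1.0)
print(f"Final NaCl after 1:1 dilution: {final_nacl:.1f} M")

# Working backwards: what stock gives 2 M after a 1:1 dilution?
print(f"Stock needed for 2 M final: {required_stock(2.0, 0.0, 1.0, 1.0):.1f} M")
```

The same two functions apply to any other buffer component whose final concentration must be held at a chosen level after dilution.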

In mammalian host cells, c-Myc is generally tightly associated with
the nuclei. Accordingly, it is necessary to solubilize c-Myc in a manner
that does not destroy its biological activity or its ability to renature into a
biologically active form. The residual nuclear material is first removed by
centrifugation and then the pellet is resuspended for solubilization of the
c-Myc. Solubilization of the c-Myc protein in a manner that destroys this
association may be achieved with either sodium dodecyl sulfate (SDS) or
urea at concentrations greater than 4 M. Preferably, 5 M urea is utilized.
Residual non-lysed nuclei may also be solubilized at this time by vigorous
stirring for about 30 min. The solution is then centrifuged to pellet any
remaining insoluble material prior to the subsequent chromatography steps,
for example, at 5000xg for about 10 min.

The supernatant fraction recovered from the centrifugation step is 
applied to a DEAE Sepharose CL-6B column equilibrated in the urea- 
containing buffer as described above, and the column thoroughly washed 
with such buffer to remove unbound protein. A second wash was 
performed with the addition of an intermediate amount of NaCl, 0.1M 
NaCl to the buffer. Finally, Myc protein was eluted by raising the salt 
concentration in the buffer to 0.35M. 

All proteins eluting with the 0.35 M salt wash were collected and
applied to an FPLC Mono-Q column. The column was washed and eluted with a
gradient of 0.10 M NaCl to 0.35 M NaCl, followed by a 2 M NaCl step
wash. Holding the gradient at intermediate salt concentrations, for example
at about 0.19 M NaCl, until the end tail of the contaminating protein is
finished eluting will enhance the purity of the subsequently eluted Myc 
protein. 

Myc may be identified in the column eluent by any technique that 
specifically recognizes Myc protein or its activity. For example, a 
monoclonal antibody such as 1F7 may be used in an immunoassay for the
presence of Myc protein. Alternatively, DNA binding activity to an
oligonucleotide containing the sequence 5'-CACGTG-3' may be used to
monitor the purification. Monoclonal antibody 1F7 is directed against the
peptide sequence of amino acids 305-317 in murine c-Myc. Other Myc
monoclonal antibodies useful in such assays are commercially available.

Pools of fractions from this column contained the C2 and C2'
binding activities described below, and the presence of peptides capable of
entering into C2 and C2' hetero-oligomers may be assayed by the ability of
such hetero-oligomers to bind to the DNA sequences 5'-CACGTG-3' and
5'-CAGCTG-3', respectively. Myc purified from the CHO cells appeared as
multiple bands by immunoblot.

b. Purification of Myc and Its Partners From a
Baculovirus Source

Human c-Myc has also been purified using the baculovirus
overexpression system. For purification, Sf9 cells that had been infected
with recombinant baculovirus carrying the c-Myc gene, using techniques
known in the art, were harvested just prior to the onset of lysis (~48 hours
post infection). Solubilization and purification of the recombinant c-Myc
were carried out as with the CHO produced Myc, resulting in a yield of 2.5
mg/8x10^8 cells. Myc purified from these insect cells was apparently
homogeneous by silver staining, and ran on electrophoresis as a single
diffuse band of ~60 kD. This was in contrast to the multiple bands
observed with mammalian Myc by immunoblot (Fig. 1B).
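
To put the stated yield in per-cell terms, the arithmetic can be sketched as below. This is an illustrative calculation only: it assumes that the apparent ~60 kD gel mobility approximates the true molecular mass, and it describes protein recovered after purification, which is a lower bound on what the cells contained. The function name is hypothetical.

```python
AVOGADRO = 6.022e23  # molecules per mole


def yield_per_cell(total_mg, n_cells, mass_kd):
    """Return (picograms per cell, molecules per cell) for a purified protein."""
    grams = total_mg * 1e-3
    pg_per_cell = grams * 1e12 / n_cells
    moles = grams / (mass_kd * 1e3)  # kD -> g/mol
    molecules_per_cell = moles * AVOGADRO / n_cells
    return pg_per_cell, molecules_per_cell


# Values taken from the text: 2.5 mg from 8x10^8 cells, ~60 kD band.
pg, molecules = yield_per_cell(2.5, 8e8, 60)
print(f"{pg:.2f} pg per cell, about {molecules:.1e} molecules per cell")
```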

c. Sequence Specific DNA Binding Activity

The above preparations contain two sequence specific DNA-binding 
activities that both contain Myc protein. The first activity contains only 
Myc (i.e., forms the Myc homo-oligomer) and binds very weakly to
sequences with the core CACGTG. The binding is assayed by determining 
the off rate and by competitor assays, both techniques known in the art. 
The binding of c-Myc homo-oligomers is characterized by an immeasurably 
fast off rate and by the observation that it is almost impossible to add
enough cold competitor sequence to completely compete away this complex
in electrophoretic mobility shift assays (EMSA). This latter observation
implies that it may not be possible to raise oligonucleotide concentrations
above the K_D, thus preventing the determination of exactly what fraction of
the final Myc preparations are active for sequence specific binding by the
Myc homo-oligomers.
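
Why an inability to exceed the K_D prevents this determination can be illustrated with a simple binding isotherm. The sketch below assumes a single-site equilibrium with the oligonucleotide in excess over protein; this is a simplifying assumption for illustration, not a model established in the text for Myc homo-oligomers, and the function name is hypothetical.

```python
def fraction_bound(oligo_conc, k_d):
    """Fraction of active protein bound to DNA for single-site binding.

    oligo_conc and k_d must be in the same concentration units.
    """
    return oligo_conc / (oligo_conc + k_d)


# Express the oligonucleotide concentration as a multiple of K_D.
for multiple in (0.01, 0.1, 1.0, 10.0):
    print(f"[oligo] = {multiple:>5} x K_D -> {fraction_bound(multiple, 1.0):.1%} bound")
```

Under this assumption, only a small part of the active protein is shifted while the oligonucleotide stays well below the K_D, so the shifted fraction cannot be read as the active fraction of the preparation.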

A binding site selection procedure may be used to determine the
optimal binding site for Myc. Sites may be selected from a pool of random
oligomers, such as 20-mers, in order to decrease bias in determining an
optimal binding site. A 12 base consensus sequence of GACCACGTGCTC
[SEQ ID No. 1] may be used, with the central E box core of CACGTG
appearing to be most conserved. Halazonetis and Kandil (Halazonetis and
Kandil, Proc. Natl. Acad. Sci. USA 88:6162-6166 (1991)) assumed that the
flanking sequences might be symmetric, and reported an optimal sequence
of GACCACGTGGTC [SEQ ID No. 2]. This sequence is quite similar to
the consensus that is preferred here, differing in only the 10th position
(where predominantly a C was utilized in the invention, although G is
significantly represented; Fig. 7, Group I). According to the invention, it
is possible to select a 12 base consensus sequence from a pool of predicted
complexity of 4^20 (~10^12), thus indicating that Myc has a strong sequence
preference despite its apparent weak binding affinity.
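
The pool complexity and the comparison of the two 12 base sequences can be checked with a short sketch. The helper names below are illustrative and do not come from the specification; the sequences are SEQ ID No. 1 and SEQ ID No. 2 as given above.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def mismatch_positions(a, b):
    """1-based positions at which two equal-length sequences differ."""
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


pool_complexity = 4 ** 20  # number of distinct random 20-mers
preferred = "GACCACGTGCTC"    # SEQ ID No. 1
symmetric = "GACCACGTGGTC"    # SEQ ID No. 2 (Halazonetis and Kandil)

print(f"Pool complexity: {pool_complexity:.2e}")
print("Differing positions:", mismatch_positions(preferred, symmetric))
print("SEQ ID No. 2 is its own reverse complement:",
      reverse_complement(symmetric) == symmetric)
print("SEQ ID No. 1 is its own reverse complement:",
      reverse_complement(preferred) == preferred)
```

The check makes the symmetry assumption concrete: SEQ ID No. 2 reads identically on both strands, whereas the consensus preferred here departs from that symmetry at position 10 only.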

The second Myc containing DNA-binding complex provided in the
preparations of the invention also binds to sequences with a core of
CACGTG, but binds significantly more tightly than Myc alone. This
complex (the C2 complex) requires a 26-29 kD factor in addition to Myc.
This additional factor copurified with Myc, presumably because of similar
chromatographic properties and not via association with Myc, since the
chromatography performed in 5 M urea would denature such association.
This additional factor resembles Max, a protein whose gene was recently
isolated from mammalian cells, in that it does not bind efficiently to DNA
by itself but can hetero-oligomerize with Myc to bind tightly to the
sequence CACGTG. However, the factor of the invention differs from
Max in its apparent size (Max is reported to migrate at 21 kD).
Additionally, the Myc/Max hetero-oligomer appears to migrate at least as
slowly as a Myc only complex in EMSAs, while the C2 complex of the
invention migrates more rapidly than Myc alone.

In addition to the 26-29 kD factor, a second copurifying factor of
40-50 kD has been identified. The sites selected by complexes containing
this factor (herein termed C2' complexes) contained a CAGCTG core (the
μE2 sequence motif) as well as flanking sequences which bear a striking
resemblance to a recently reported binding site for myogenin
homo-oligomers (Wright et al., Mol. Cell. Biol. 11:4104-4110 (1991)).
Myogenin is an HLH containing protein of predicted molecular weight 32.5
kD whose optimal binding site is AACAGT/CTGTT [SEQ ID No. 3].
None of the sites (0/36) selected by the C2 or C2' complexes of the
invention contained a CAGTTG motif, while roughly half of the myogenin
selected sites contained such core sequences.
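
Sorting selected sites by their core motif, as described above, can be sketched as follows. This is an illustrative helper only: the function names and labels are hypothetical, and the test sites are the consensus and myogenin sequences quoted in the text rather than experimental data. Because the reverse complement of a CANNTG hexamer is itself a CANNTG hexamer, scanning one strand finds cores on either strand.

```python
import re

# Lookahead so that overlapping CANNTG hexamers are all reported.
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(_COMPLEMENT)[::-1]


def canonical(core):
    """Strand-independent form of a hexamer core."""
    return min(core, reverse_complement(core))


def ebox_cores(site):
    """All CANNTG hexamers in a site, read on the given strand."""
    return [m.group(1) for m in EBOX.finditer(site.upper())]


LABELS = {
    canonical("CACGTG"): "CACGTG core (C2-type site)",
    canonical("CAGCTG"): "CAGCTG core (C2'-type site)",
    canonical("CAGTTG"): "CAGTTG core (reported in myogenin-selected sites)",
}


def classify_site(site):
    return [LABELS.get(canonical(c), "other CANNTG core") for c in ebox_cores(site)]


for site in ("GACCACGTGCTC", "AACAGCTGTT", "AACAGTTGTT"):
    print(site, "->", classify_site(site))
```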

d. Assay for a Compound that Inhibits Myc Action

For ease in describing these assays, C1 complex association
and/or DNA binding, C2 complex association and/or DNA binding, and
C2' complex association and/or DNA binding are all referred to as c-Myc
activity.

Assays for c-Myc activity may be performed in vitro or in vivo. In 
vitro assays may be performed as described in the Examples, for example, 
by evaluating the effects of the desired compound or various amounts of such
compound on the results of the electrophoretic mobility shift assay and site
selection techniques that will reveal whether binding of the oligomer of
interest to a specific DNA sequence motif has occurred in the presence of
the compound.

For the in vivo assay of a compound that inhibits the desired Myc
activity, at least two genetic constructs are utilized. First, a
recombinant construct capable of expressing Myc is required; second, a
reporter gene is required whose expression is operably linked to the Myc
activity and especially to the binding of the desired oligomer to the specific
DNA sequence or motif.

If desired, a recombinant construct capable of expressing a C2
complex protein or C2' complex protein may also be used. Alternatively, a
host may be chosen that naturally expresses such protein.

Recombinant constructs that are capable of expressing Myc protein
may be constructed utilizing the guidelines as described below or purchased
commercially.

The desired DNA binding sequence may be operably linked to any
gene which confers a selectable marker in the host system. In a preferred
embodiment, a marker gene which allows phenotypic selection in yeast,
and especially in Saccharomyces cerevisiae, is used.

Yeast that have been co-transformed with both an expressible myc
gene and with the desired DNA binding sequence may be used to (1)
identify the presence or absence of endogenous host proteins that interact
with Myc in a C2 or C2' complex; (2) classify a protein as a C1 complex
protein or as a C2' complex protein; and (3) identify and classify
compounds as agents which disrupt such Myc activity. C2 complex proteins
have previously also been termed Myc "partner" proteins.

All three applications are based on the same principle: in the
presence of c-Myc biological activity, one of three things will happen: C1
complexes will form; C2 complexes will form; or C2' complexes will
form. The protein complexes so formed, and especially the oligomeric
complexes, will bind to a specific DNA motif, binding to such motif will
be operably linked to the marker gene, and expression of the marker gene
will be altered, preferably stimulated, in response to such DNA binding.
In the absence of such oligomerization, oligomer-DNA complex formation
will not occur and expression of the marker protein will not be altered.

In the assays of the invention, there may be some level of binding to
a desired DNA binding sequence even in the absence of c-Myc. However,
when c-Myc is present in the cell, the amount and strength of the specific
DNA binding is increased.

Hosts that have been co-transformed with both an expressible c-Myc
gene and with the desired DNA binding sequence may be used to assay for
the presence or absence of endogenous host proteins that interact with
c-Myc activity. If such analyses reveal that the host contains c-Myc binding
proteins, or c-Myc dependent oligomers which, in the presence of c-Myc,
specifically bind to a desired DNA sequence, such c-Myc partner protein or
dependent-oligomer protein may be isolated using techniques known in the
art such as gel mobility shift analysis, cDNA expression cloning vectors
such as, for example, λgt10 and λgt11, or other cloning systems
specifically designed for high-efficiency cloning and expression of
full-length cDNA in yeast such as, for example, pG1 and pHSP56, all of which
are commercially available (Clontech, Palo Alto, California).

It is not necessary that the host be completely deficient in C2
complex proteins (c-Myc partner proteins) or C2' complex proteins to be
useful in the method of the invention. As described below, if c-Myc is
expressed at levels much greater than those found in the host, reporter gene
transcription from endogenous partner proteins may be negligible, or of
such low amount that it does not otherwise alter the utility of the methods
of the invention.

If c-Myc expression is driven by a strong promoter,
and/or if the c-Myc expression cassette is supplied on a high copy number
vector, the levels of c-Myc will be high enough to overcome a low level
background, and such c-Myc constructs may be used to analyze the ability
of cloned c-Myc partners to influence c-Myc DNA binding. One of
ordinary skill in the art can adapt the expression system to the level of
expression desired using methods known in the art.

The C2 complex protein (the partner protein), or the C2' complex
protein, if supplied as a recombinant construct to the host cell, should be
capable of expressing at levels comparable to that of the c-Myc protein.
C2 complex proteins may be identified by utilizing a phage plaque assay,
as described in the commonly-owned, copending U.S. patent application
entitled "Protein Partner Screening Assays and Uses Thereof," Application
No. 510,254, filed April 19, 1990, and incorporated herein by reference.

Proteins identified by such screening assay can be subcloned into
eukaryotic expression vectors known in the art and commercially available
so as to provide a recombinant source of partner protein gene expression.

The genetic constructs of the invention may be placed on different
plasmids, or combined on one plasmid. A construct may also be inserted
into the genome of a host cell. Preferably, the construct coding for the
c-Myc protein and the construct coding for the C2 complex protein or the
C2' complex protein are provided to the host on two different plasmids.

It is important to establish that the effect of the compound is due to
an effect on c-Myc activity and not an effect on the activity of the reporter
product per se. Such effect can be established by comparing the results
with those found in hosts which lack either the c-Myc expression vector or
the C2 or C2' protein expression vector or both.
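
The control comparison described above can be expressed as a simple decision rule. The sketch below is hypothetical in every detail: the function names, the 50% threshold, and the reporter readings are illustrative assumptions, not values from the specification. Each host is represented by a pair of reporter readings (untreated, treated with compound).

```python
def fractional_drop(untreated, treated):
    """Fraction of reporter signal lost on treatment (0 if no baseline signal)."""
    if untreated <= 0:
        return 0.0
    return (untreated - treated) / untreated


def classify_compound(myc_host, control_host, threshold=0.5):
    """Compare a host carrying the c-Myc vector with a control host lacking it.

    myc_host, control_host: (untreated, treated) reporter readings.
    threshold: assumed minimum fractional drop that counts as inhibition.
    """
    drop_myc = fractional_drop(*myc_host)
    drop_control = fractional_drop(*control_host)
    if drop_myc < threshold:
        return "no inhibition detected"
    if drop_control >= threshold:
        return "nonspecific effect on reporter or host"
    return "candidate inhibitor of c-Myc activity"


# Illustrative readings in arbitrary units.
print(classify_compound(myc_host=(100, 20), control_host=(10, 9)))
print(classify_compound(myc_host=(100, 20), control_host=(10, 2)))
print(classify_compound(myc_host=(100, 90), control_host=(10, 9)))
```

A compound is scored as a candidate only when the reporter falls in the Myc-expressing host while the background signal in the control host is largely unchanged.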

The desired DNA binding motif may be located at any site in the
transcription cassette of the reporter gene which allows for the transcription
of that gene to be operably-linked to binding of the desired oligomer. Thus,
such motif may be located 5' to the transcriptional start site or 3' to the
transcriptional start site, for example, in an intron, similar to its location
relative to the promoter region in the immunoglobulin genes.

The reporter gene whose expression is operably linked to c-Myc
activity and especially to oligomer DNA binding may be any gene whose
expression can be monitored. Any detectable phenotype change may serve
as the basis for the methods of the invention. In a preferred embodiment,
the reporter gene is a gene not normally expressed by the host, or a gene
that replaces the host's endogenous gene. Any reporter gene which is
capable of being operably-linked to a promoter capable of responding to the
binding of the oligomer of interest to the specific target DNA sequence
may be used.

Especially, for example, genes that endow the host with an ability to 
grow on a selective medium are useful. For example, in yeast, use of the 
yeast LEU2 gene as a reporter gene in strains that normally lack LEU2 
allows such yeast to grow on leucine as a sole carbon source. Expression 
of the reporter gene is monitored by merely observing whether the host
possesses the ability to grow on leucine. In a similar manner, use of the
SUC2 gene as a reporter gene would allow growth of a suc2- yeast host
on sucrose to be used as the detection method. In both examples, growth 
on the indicated substrate could be used to indicate specific DNA binding 
of the oligomer of interest and lack of such growth could be used to 
indicate lack of binding or lack of oligomer formation. 

In another example, a construct (and host) which is gall + gallV 
would respond to galactose in the medium; a construct (and host) which is 
lac2 + gaH + would be lactose sensitive. Other reporter genes include his3, 
ura3 and trp5. One of ordinary skill in the art can imagine many other
appropriate reporter systems which would reveal the presence or inhibition
of DNA binding or biological activity of the oligomer of interest.

Reporter constructs in which the desired DNA sequence motif and
the lacZ reporter gene are operably linked will express β-galactosidase in
response to binding of a c-Myc activity induced oligomer to such
DNA sequence. Such expression can be easily scored by monitoring the
ability of the host to produce β-galactosidase (Maniatis, T. et al.,
Molecular Cloning (A Laboratory Manual), 2nd edition, Cold Spring
Harbor Laboratory, 1989). The production of β-galactosidase may be
visually monitored by detecting its activity to reduce the chromophoric dye,
X-gal (commercially available from International Biotechnologies, Inc.,
New Haven, CT). β-galactosidase reduces X-gal to a form which
possesses a blue color. In another embodiment, the coding sequence of
chloramphenicol acetyltransferase (CAT) is used as the reporter gene.

Any detection method that can identify expression of the reporter
gene may be used. For example, levels of the product of the reporter gene
may be directly assayed with an immunoassay. Such immunoassays
include those wherein the antibody is in a liquid phase or bound to a solid
phase carrier. In addition, the reporter gene can be detectably labeled in
various ways for use in immunoassays. The preferred immunoassays for
detecting a reporter protein include radioimmunoassays, enzyme-linked
immunosorbent assays (ELISA), or other assays known in the art,
such as immunofluorescent assays, chemiluminescent assays, or
bioluminescent assays.

In an assay to screen for the ability of a compound to alter binding
of the oligomer of interest, yeast strains that express the desired
peptide or peptides and which contain the related DNA binding sequence
motif may be plated and grown as lawns, and the compound to be tested
may be applied to the plates on a filter paper disk that is impregnated with
such compound. Alternatively, the compound may be incorporated into the
media within which the host cells are growing.

One may be able to detect the ability of a compound to alter c-Myc
activity by the appearance of a zone, which often resembles a halo, around
the compound-impregnated disk. If, for example, the compound is toxic to
the host's survival per se, the host will not grow in the zone containing the
compound.

The methods of the invention can be used to screen compounds in
their pure form, at a variety of concentrations, and also in their impure
form. The methods of the invention can also be used to identify the
presence of such inhibitors in crude extracts, and to follow the purification
of the inhibitors therefrom. The methods of the invention are also useful in
the evaluation of the stability of the inhibitors identified as above, to
evaluate the efficacy of various preparations.

The permeability of cells to various compounds can be enhanced, if
necessary, by use of a mutant cell strain which possesses an enhanced
permeability or by using compounds which are known to increase
permeability. For example, in yeast, compounds such as polymyxin B
nonapeptide may be used to increase the yeast's permeability to small
organic compounds. In cells from the higher eukaryotes, dimethyl sulfoxide
(DMSO) may be used to increase permeability. Analogs of such
compounds which are more permeable across yeast membranes may also be
used. For example, dibutyryl derivatives often display an enhanced
permeability.

In a preferred embodiment, the genetic constructs and the methods 
for using them are utilized in eukaryotic hosts, and especially in yeast, 
insect and mammalian cells. The introduced sequence is incorporated into
a plasmid or vector capable of either autonomous replication or integrative 
activity. 

The sequence of c-Myc is known (Battey, J. et al., Cell 34:779-787
(1983)) and probes which are capable of identifying a c-Myc clone are
commercially available (New England Nuclear/DuPont Biotechnology,
Boston, MA).

The DNA sequence of the desired gene may be chemically
constructed if it is not desired to utilize a clone of the genome or mRNA as
the source of the genetic information. Methods of chemically synthesizing
DNA are well known in the art (Oligonucleotide Synthesis, A Practical
Approach, M.J. Gait, ed., IRL Press, Washington, D.C., 1984; Synthesis
and Applications of DNA and RNA, S.A. Narang, ed., Academic Press,
San Diego, CA, 1987). Because the genetic code is degenerate, more than
one codon may be used to construct the DNA sequence encoding a
particular amino acid (Watson, J.D., In: Molecular Biology of the Gene,
3rd edition, W.A. Benjamin, Inc., Menlo Park, CA, 1977, pp. 356-357).

To express the recombinant constructs of the invention,
transcriptional and translational signals recognizable by the host are
necessary. A cloned protein encoding DNA sequence, obtained through the
methods described above (preferably in a double-stranded form), may be
operably-linked to sequences controlling transcriptional expression in an
expression vector, and introduced, for example by transformation, into a
host cell to produce recombinant proteins useful in the methods of the
invention, or functional derivatives thereof. Such techniques are well
known in the art (Recombinant DNA Methodology, Wu, R. et al., eds.,
Academic Press, (1989); Maniatis, T. et al., Molecular Cloning (A
Laboratory Manual), second edition, Cold Spring Harbor Laboratory,
1989).
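
The degeneracy of the genetic code noted above, which matters when a coding sequence is chemically synthesized rather than cloned, can be made concrete by counting how many DNA sequences encode a given peptide. The sketch below builds the standard genetic code table; the function names are hypothetical, and the peptide used in the example is an arbitrary four-residue string chosen for illustration.

```python
from itertools import product

# Standard genetic code, codons ordered with T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons_for(amino_acid):
    """All codons encoding a one-letter amino acid ('*' for stop)."""
    return [codon for codon, aa in CODON_TABLE.items() if aa == amino_acid]


def count_encodings(peptide):
    """Number of distinct DNA sequences that encode the peptide."""
    total = 1
    for aa in peptide:
        total *= len(codons_for(aa))
    return total


print("Leucine codons:", codons_for("L"))
print("Encodings of the illustrative peptide MPLN:", count_encodings("MPLN"))
```

Because the count multiplies across residues, even short peptides admit many synonymous coding sequences, which leaves room to choose codons suited to the intended host.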

Transcriptional initiation regulatory signals can be selected which
allow for repression or activation of the expression of the c-Myc construct
or construct of the recombinant C2 complex peptide (or the C2' peptide),
or both, so that expression of such constructs can be modulated, if desired.
Of interest are regulatory signals which are temperature-sensitive so that by
varying the temperature, expression can be repressed or initiated, or are
subject to chemical regulation, for example, by a metabolite, salt, or
substrate added to the growth medium.

Where the native expression control sequence signals do not
function satisfactorily in the host cell, then sequences functional in the host
cell may be substituted.

Expression of the constructs of the invention in different hosts may
result in different post-translational modifications which may alter the
properties of the proteins expressed by these constructs. It is necessary to
express the proteins in a host wherein the ability of the protein to retain its
biological function is not hindered. Expression of proteins in yeast hosts is
preferably achieved using yeast regulatory signals. The vectors of the
invention may contain operably-linked regulatory elements such as
upstream activator sequences in yeast, or DNA elements which confer
species, tissue or cell-type specific expression on an operably-linked gene.

In general, expression vectors containing transcriptional regulatory
sequences, such as promoter sequences and transcription termination
sequences, are used in connection with a host. These sequences facilitate
the efficient transcription of the gene fragment operably-linked to them.
In addition, expression vectors also typically contain discrete DNA
elements such as, for example, (a) an origin of replication which allows for
autonomous replication of the vector, or elements which promote insertion
of the vector into the host's chromosome in a stable manner, and (b)
specific genes which are capable of providing phenotypic selection in
transformed cells. Eukaryotic expression vectors may also contain elements
which allow them to be maintained in prokaryotic hosts; such vectors are
known as shuttle vectors.

The precise nature of the regulatory regions needed for gene 
expression will vary between species or cell types and there are many 
appropriate expression vector systems that are commercially available. 

In a highly preferred embodiment, yeast are used as the host cells. 
The elements necessary for transcriptional expression of a gene in yeast 
have been recently reviewed (Struhl, K. Ann. Rev. Biochem. 58:1051-1077 
(1989)). In yeast, most promoters contain three basic DNA elements: (1)
an upstream activator sequence (UAS); (2) a TATA element; and (3) an
initiation (I) element. Some promoters also contain operator elements.
Methods in yeast genetics are well known (Struhl, K. Nature 305:391-397
(1983); Sherman et al., Methods in Yeast Genetics, Cold Spring Harbor
Laboratory (1983)).

In another embodiment, mammalian cells are used as the host cells. 
A wide variety of transcriptional and translational regulatory signals can be 
derived for expression of proteins in mammalian cells and especially from 
the genomic sequences of viruses which infect eukaryotic cells. 




WO 93/08701 PCT/US92/08603



Once the vector or DNA sequence containing the construct(s) is
prepared for expression, the DNA construct(s) is introduced into an
appropriate host cell by any of a variety of suitable means, for example by
transformation. After the introduction of the vector, recipient cells are
grown in a selective medium, which selects for the growth of
vector-containing cells. Expression of the cloned gene sequence(s) results in the
production of the protein. This expression can take place in a continuous
manner in the transformed cells, or in a controlled manner.

Genetically stable transformants may be constructed with episomal
vector systems, or with integrated vector systems whereby the fusion
protein DNA is integrated into the host chromosome. Such integration may
occur de novo within the cell or be assisted by transformation with a vector
which functionally inserts itself into the host chromosome, for example,
with retroviral vectors, transposons or other DNA elements which promote
integration of DNA sequences in chromosomes.

Cells which have been transformed with the DNA vectors of the
invention are selected by also introducing one or more markers which allow
for selection of host cells which contain the vector. For example, the
marker may provide biocide resistance, e.g., resistance to antibiotics, or
heavy metals, such as copper, or the like.

The transformed host cell can be fermented or cultured according to 
means known in the art to achieve optimal cell growth, and also to achieve 
optimal expression of the cloned protein sequence fragments. As described 
hereinbelow, a high level of recombinant protein expression for the cloned 
sequences coding for the proteins can be achieved according to a preferred
procedure of this invention. 

The methods of the invention are not intended to be limited to
c-Myc and possess utility for the characterization of inhibitors against any
Myc protein, such as, for example, N-Myc and L-Myc. The C2 complex
peptides of the invention may interact with more than one Myc protein and
the C2' complex peptides of the invention may form as the result of the
activity of more than one Myc protein.

The following examples further describe the materials and methods 
used in carrying out the invention. The examples are not intended to limit 
the invention in any manner.

EXAMPLES

Example I 

Materials and Methods

The 5A cell line was maintained in

spinner culture under selection with 80 μM methotrexate. Protein
purification started with roughly 6 liters of cells at 5x10^5/ml grown up
without selection. Heat shock promoter induction was achieved by 
resuspension in preheated fresh media (43°C) at 1/3 the original volume.
Cells were incubated with stirring at 43°C for 1 h. To allow translation of
the accumulated mRNA, cells were transferred to 37°C culture conditions
for 3 h. Cells were then subjected to the purification described below.

The baculovirus overexpression vector was constructed by insertion
of the BamHI/BclI fragment of pGEMMycB [Halazonetis and Kandil,
Proc. Natl. Acad. Sci. USA 88:6162-6166 (1991)] into the BamHI site of a
baculovirus expression vector, pVL941, obtained from the laboratory of
Max Summers (Texas A&M University, College Station, Texas). The
resulting plasmid contained the entire coding sequence of the human Myc
gene including 6 nucleotides 5' of the initiation codon and 3' untranslated
sequence extending to the genomic RsaI site. Sf9 cells were grown and
infected with recombinant baculovirus according to the methods of
Summers [Summers and Smith, A Manual of Methods for Baculovirus
Vectors and Insect Cell Culture Procedures, Texas Agricultural Experiment
Station Bulletin No. 1555] with minor changes. Cells were passaged in
spinner culture and plated on 150 mm diameter tissue culture plates for
protein production. Cells were infected and harvested approximately 48 h 
post infection by scraping. Cells were then washed in PBS and subjected 
to the purification described below. 

The Protein A-c-Myc fusion protein was expressed in the E. coli
AR68 strain from a previously published pRIT2T vector [Dang, C.V.,
Anal. Biochem. 174:313-317 (1988)] which fused the Ig binding portion of
protein A to either amino acids 353-439 or amino acids 372-439 of c-Myc.
Growth and induction of the cells was as per Dang et al. [Anal. Biochem.
174:313-317 (1988)].

Protein Purification: All purification steps were carried out on ice or with
ice cold buffers unless otherwise stated. Cells may be used fresh or stored
quick frozen in liquid nitrogen for larger batch preparations. 5A or Sf9
cells were washed in phosphate-buffered saline (PBS) and resuspended at
2.1x10^7 cells/ml in Low Salt Lysis Buffer (20 mM HEPES pH 6.8, 5 mM
KCl, 5 mM MgCl2, 0.5% NP40, 0.1% Na-deoxycholate, 1 µg/ml
aprotinin, and 0.1 mM PMSF) [Evan and Hancock, Cell 43:253-261
(1985)]. After 10 min cells were subjected to 40 strokes in a Dounce
homogenizer with a type A pestle. Nuclei were pelleted at 1000xg, 5 min,
4°C, washed once in 50 ml Low Salt Lysis Buffer, resuspended at 2.5x10^8
nuclei/ml in Low Salt Lysis Buffer containing 50 µg/ml DNAse I and
incubated at 4°C for 1 h. An equal volume of ice cold 2X High Salt
Buffer (2x concentrations: 20 mM Tris, pH 7.4, 4 M NaCl, 1 mM MgCl2,
and 0.1% NP40) [Evan and Hancock, Cell 43:253-261 (1985)] was then
added, mixed gently, and incubated for 10 min. The residual nuclear
material (including the c-Myc protein) was pelleted (2000xg, 10 min, 4°C)
and resuspended for solubilization at 5.5x10^7 nucleus equivalents/ml in
Buffer A (50 mM Tris, pH 8.0, 2 mM EDTA, 5% glycerol, 0.1 mM DTT,
and 0.1 mM PMSF) [Watt et al., Mol. Cell. Biol. 5:448-456 (1985)]
containing 5 M urea (referred to as 5 M urea Buffer A) achieved by
dilution of a freshly deionized stock of 6 M urea. This and all buffers used
on columns were passed through 0.2 µm pore filter units. Residual nuclei
were solubilized by vigorous stirring on ice for 30 min. This protein
solution was centrifuged (10 min, 5000xg, 4°C) to pellet any insoluble
material prior to chromatography. The supernatant was loaded on a 10 ml
DEAE Sepharose CL-6B (Pharmacia) column equilibrated with 5 column
volumes of 5 M urea Buffer A. Sample loading was at 0.1 ml/min and
column washing and elution were at 0.4 ml/min. After loading, the
column was washed with 3 volumes 5 M urea Buffer A containing no
additional salt followed by 4 volumes of the same buffer containing 0.1 M
NaCl. Myc protein was eluted in the following elution step at 0.35 M
NaCl. The protein containing fractions of this 0.35 M NaCl step were
pooled and diluted with fresh 5 M urea Buffer A to 0.1 M NaCl and loaded
onto a 1 ml FPLC Mono-Q column (Pharmacia) run at 0.5 ml/min. The
Mono-Q column was eluted with a programmed gradient of 5 ml spanning
0.10 M NaCl to 0.35 M NaCl followed by a 2 M NaCl step. For
enhanced purity the gradient was held manually at approximately 0.19 M
until the major contaminating protein finished eluting as determined by an
in line UV monitor. In the initial development of the purification protocol
fractions from the columns were assayed for Myc by slot blotting followed
by visualization using the 1F7 monoclonal antibody and radiolabeled
secondary antibody. For later preparations silver staining of SDS-PAGE
allowed sufficiently unambiguous identification of the Myc proteins and
provided an assessment of the purity of given fractions. The Myc
containing fractions were pooled based on purity and dialyzed against
buffer containing 20 mM Tris, pH 7.8, 50 mM KCl, 10% glycerol, 0.1
mM DTT and 0.1 mM PMSF (referred to as Dialysis Buffer) in bags of
SpectroPor 2 membrane for 3 changes, 2 liters each, for a minimum of 3 h
each. Pools of fractions prepared this way contained C1 and C2 (and C2')
binding activities. To obtain pure C1 binding activity the Myc-containing
Mono Q fractions were assayed by EMSA and those free of C2 binding
activity were pooled and dialyzed separately.
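
The salt adjustments in this protocol (adding an equal volume of 2X High Salt Buffer, then diluting the 0.35 M NaCl pool to 0.1 M before the Mono-Q column) follow from a simple mass balance. The Python sketch below illustrates that arithmetic only. It assumes ideal mixing with additive volumes; the function names and the 6 ml pool volume are hypothetical and are not taken from the protocol.

```python
def mixed_concentration(vol_a_ml, conc_a_m, vol_b_ml, conc_b_m):
    """Concentration (M) after combining two solutions, assuming additive volumes."""
    return (vol_a_ml * conc_a_m + vol_b_ml * conc_b_m) / (vol_a_ml + vol_b_ml)


def diluent_volume_ml(pool_ml, pool_conc_m, target_conc_m, diluent_conc_m=0.0):
    """Volume of diluent needed to bring a pool down to the target concentration."""
    if not diluent_conc_m < target_conc_m <= pool_conc_m:
        raise ValueError("target must lie above the diluent and at or below the pool")
    return pool_ml * (pool_conc_m - target_conc_m) / (target_conc_m - diluent_conc_m)


# Equal volumes of nuclei in NaCl-free lysis buffer and 2X High Salt Buffer (4 M NaCl).
final_nacl = mixed_concentration(10.0, 0.0, 10.0, 4.0)

# Hypothetical 6 ml DEAE pool at 0.35 M NaCl, diluted with salt-free 5 M urea Buffer A.
buffer_needed = diluent_volume_ml(6.0, 0.35, 0.10)

print(f"NaCl after adding 2X High Salt Buffer: {final_nacl:.2f} M")
print(f"Buffer A needed to reach 0.1 M NaCl: {buffer_needed:.1f} ml")
```

Under these assumptions the pool is diluted 3.5-fold overall, so the loading volume for the Mono-Q column grows accordingly.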

The bacterially produced Protein A-c-Myc fusion protein was
partially purified by differential centrifugation and solubilized in 5 M urea
according to Watt et al. [Bagchi et al., Mol. Cell. Biol. 7:4151-4158
(1987)] with the following minor modifications: Protease inhibitors were
present in the initial lysis buffer (10 ng/ml pepstatin, 1 mM PMSF, 50
µg/ml aprotinin, 2 µg/ml leupeptin, 10 mM Na-metabisulfite, and 1 mM
benzamidine) and cells were sheared by 6 bursts of 15 s each in a Cuisinart
MiniMate on ice. The urea solubilized material was cleared of insoluble
material by centrifugation (10,000xg, 10 min, 4°C) and dialyzed into
Dialysis Buffer containing 0.5 mM DTT. Precipitated material was
removed by centrifugation (15,000xg, 20 min, 4°C). Protein A-Myc
fusion protein was purified from the supernatant by IgG affinity essentially
according to Nilsson et al. [EMBO J. 4:1075-1080 (1985)]. A 1 ml aliquot
of supernatant was incubated with 0.1 ml of a 50% slurry of IgG Sepharose
6 fast flow (Pharmacia) rocking for 1 h at 4°C. The pellet was washed
twice with Buffer A and the fusion protein eluted with 0.3 M lithium
diiodosalicylate (LIS). The eluate was then dialyzed extensively to remove
the LIS (initially against Buffer A at room temperature to avoid LIS
precipitation, then against Dialysis Buffer at 4°C). The two bacterially
expressed Myc preparations were compared by Coomassie staining of
SDS-PAGE to ensure that equal amounts of the fusion proteins were used for
experiments.

N-terminal Sequencing: The 3 bands of purified Myc from 5A cells were
individually isolated by electroelution according to Hunkapiller et al.
[Meth. Enz. 91:227-236 (1983)]. Preparative SDS-PAGE was carried out
and protein bands excised after visualization with Coomassie Brilliant Blue
R-250. After electroelution the material was precipitated 2 times with
methanol/acetone and submitted for N-terminal sequencing by Edman
degradation.



Antibodies: The monoclonal antibody, 1F7 (a generous gift of R.
Chizzonite, Hoffman LaRoche), is directed against the peptide sequence
comprising amino acids 305-317 in murine c-Myc [Miyamoto et al., Proc.
Natl. Acad. Sci. USA 82:7232-7236 (1985)]. The antibody directed against
cI was monoclonal 51F [Breyer and Sauer, J. Biol. Chem.
264:13348-13354 (1989)] which had been purified by ammonium sulfate
precipitation and chromatography on QAE Sephadex.

Electrophoretic Mobility Shift Assays (EMSA): Radiolabeled probes were
produced via a Klenow fill in of annealed oligonucleotides containing 4
base 5' overhangs at each end (see table below for sequences). Binding
reactions took place in a final volume of 20 µl containing 2 ng of labeled
probe, 125 ng poly d(IC), an indicated amount of protein, and the
following final buffer conditions: 10 mM Tris, pH 7.5, 50 mM KCl, 0.1
mM EDTA, 1 mM DTT, 1 mM MgCl2 and 5% glycerol. Binding
reactions were allowed to proceed for 20 min at room temperature and
were then loaded immediately on a 4% polyacrylamide gel which had been
prerun at least 1 h at 10 V/cm. Electrophoresis was for 1.5 h at 10 V/cm in
0.5x TBE.
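
For comparing protein and probe amounts it can help to convert the probe mass into a molar concentration. The Python sketch below does this for the 2 ng of probe in a 20 µl reaction stated above. It assumes an average mass of 650 g/mol per base pair and a 40 bp duplex; both values, and the function names, are assumptions made for illustration rather than figures from the text.

```python
AVG_BP_MASS_G_PER_MOL = 650.0  # assumed average mass of one DNA base pair


def dsdna_pmol(mass_ng, length_bp):
    """Picomoles of a double-stranded DNA fragment of the given mass and length."""
    moles = mass_ng * 1e-9 / (length_bp * AVG_BP_MASS_G_PER_MOL)
    return moles * 1e12


def concentration_nm(pmol, volume_ul):
    """Concentration in nM; pmol per µl equals µM, so multiply by 1000."""
    return pmol / volume_ul * 1e3


probe_pmol = dsdna_pmol(mass_ng=2.0, length_bp=40)   # 40 bp is an assumed length
probe_nm = concentration_nm(probe_pmol, volume_ul=20.0)

print(f"probe amount: {probe_pmol:.3f} pmol")
print(f"probe concentration: {probe_nm:.2f} nM")
```

With these assumptions the probe is present in the low nanomolar range, so a binding reaction containing a fraction of a microgram of a 60 kD protein has protein in large molar excess over probe.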

Cut and Renature: The method of Bagchi et al. [Bagchi et al., Mol. Cell.
Biol. 7:4151-4158 (1987)] was followed except for the final dialysis step.
Precipitated protein samples containing BSA as carrier protein were
solubilized in 6 M guanidine-hydrochloride (200 µl unless otherwise
indicated) according to Bagchi et al. [Mol. Cell. Biol. 7:4151-4158 (1987)].
Directly prior to analysis by EMSA the samples were subjected to dialysis
alone or in combination with another sample in a total volume of 15 µl.
Equal volumes of each sample were used in a given experiment and the
volume was brought to 15 µl using 6 M GuHCl containing 0.1 mg/ml
BSA. Dialysis was against 40 ml of Dialysis Buffer carried out for 1 h at
4°C on floating 13 mm membrane discs (Millipore #VSWP-013, pore size
0.025 µm).
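
The drop dialysis step removes guanidine by diluting a 15 µl sample into a 40 ml bath. The Python sketch below estimates the residual guanidine concentration if the sample and bath reach full equilibrium. Whether equilibrium is actually reached in 1 h is not stated in the text, so the result is a lower bound on the residual concentration; the function name is hypothetical.

```python
def equilibrium_concentration_m(sample_conc_m, sample_ml, bath_ml, bath_conc_m=0.0):
    """Solute concentration (M) once sample and bath are fully equilibrated."""
    total_ml = sample_ml + bath_ml
    return (sample_conc_m * sample_ml + bath_conc_m * bath_ml) / total_ml


# 15 µl of 6 M guanidine-hydrochloride floated on 40 ml of Dialysis Buffer.
residual_m = equilibrium_concentration_m(6.0, 0.015, 40.0)
dilution_factor = 6.0 / residual_m

print(f"residual GuHCl at equilibrium: {residual_m * 1000:.2f} mM")
print(f"overall dilution: about {dilution_factor:.0f}-fold")
```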

Selection from Random Sequences: The following procedure was
devised based on the method of Pollock and Treisman [Nucl. Acids Res.
18:6197-6204 (1990)]. A 52 base oligonucleotide "randomer" (see table
below) was annealed to the following 16 base primer: Xho I primer 5'
CCGATATCTCGAGACGG 3', [SEQ ID No. 4]. The annealed primer
was extended using Klenow and nucleotides (0.2 mM cold dNTPs and 0.4
µl α-32P-dCTP 8000/mmol) to create a pool of double stranded probes
representing approximately 4^20 sequences. The initial round of binding site
selection by EMSA utilized 200 ng of this pool and either 0.37 µg of
baculovirus produced c-Myc or 0.5 µg of CHO produced c-Myc. Other
parameters were as previously described for EMSA. Lanes containing
randomer probes were alternated with reference lanes containing 2 ng
(USE)3 probe and 0.37 µg of baculovirus c-Myc. The completed EMSA
gel was electroblotted onto NA45 membrane (200 mA, 2.5 hrs) and the wet
membrane was wrapped in plastic wrap and exposed for at least 1.5 hrs.
The regions of the randomer lanes corresponding to the visible C1 and C2
complexes of the reference lanes were excised and eluted with 100 µl of
elution solution (10 mM Tris, pH 8.0, 1 mM EDTA, 1 M NaCl) 30 min at
68°C. The liquid was transferred to a fresh tube and the membrane was
rinsed with 100 µl TE which was added to this eluate. After pelleting the
particulate debris, the DNA was precipitated with the addition of 10 µg
glycogen, 2 µl 1 M MgCl2 and 2.5 volumes of ethanol. The pellet was
rinsed with 70% ethanol, dried, and the recovery assessed by scintillation
counter. The entire pellet of each sample (~29-57 pg) was resuspended in
10 µl 10x PCR buffer (500 mM KCl, 100 mM Tris, pH 8.4, 1 mg/ml
gelatin, 15 mM MgCl2) and 32 µl water. After addition of 1 µl each of
100 µM Xho I primer and Xba I primer (5' GGACGATCTAGATTCG 3',
[SEQ ID No. 5]), 5 µl of nucleotide mix (2 mM dNTPs and 4 µl
α-32P-dCTP 8000/mmol), and 1 U Taq polymerase the reactions were overlaid
with paraffin oil and subjected to 20 cycles of PCR in an Ericomp
machine: 2 min 94°C, 20x (15 sec 95°C, 15 sec 55°C), 10 min 72°C.
The products were gel purified on 10% acrylamide and precipitated using
10 µg glycogen as carrier. Recovery was measured by scintillation counter
and after resuspension in the EMSA reaction buffer (10 mM Tris, pH 7.5,
50 mM KCl, 1 mM EDTA, 1 mM MgCl2, and 5% glycerol) this probe
was used for the next round of EMSA selection. Subsequent cycles were
primarily as above, however, 50 ng of probe was used. Eight rounds of
selection and amplification were completed for the baculovirus c-Myc and
seven rounds for the CHO c-Myc. After the final PCR reaction the
products were extracted twice with phenol, twice with ether, and
precipitated prior to digestion with Xho I and Xba I. After gel isolation the
appropriate fragment was subcloned into the Bluescript SK vector
(Stratagene) and sequenced by standard procedures.
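
The size of the starting pool can be compared with the number of possible 20 base random sequences. The Python sketch below estimates how many double stranded molecules are present in the 200 ng used in the first round. It assumes an average mass of 650 g/mol per base pair and a fully double stranded 52 bp product; the constant and function names are assumptions for illustration.

```python
AVOGADRO = 6.02214076e23       # molecules per mole
AVG_BP_MASS_G_PER_MOL = 650.0  # assumed average mass of one DNA base pair


def pool_complexity(random_bases):
    """Number of distinct sequences in a fully random region of this length."""
    return 4 ** random_bases


def dsdna_molecules(mass_ng, length_bp):
    """Approximate number of duplex molecules in the given mass of DNA."""
    moles = mass_ng * 1e-9 / (length_bp * AVG_BP_MASS_G_PER_MOL)
    return moles * AVOGADRO


complexity = pool_complexity(20)
molecules = dsdna_molecules(mass_ng=200.0, length_bp=52)
copies_per_sequence = molecules / complexity

print(f"possible 20-mers: {complexity:.3e}")
print(f"molecules in 200 ng: {molecules:.3e}")
print(f"average copies per sequence: {copies_per_sequence:.1f}")
```

Under these assumptions each possible sequence is represented only a few times in the first round, which is one reason repeated rounds of selection and amplification are needed before high affinity sites dominate the pool.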

Oligonucleotides Used: Oligonucleotide sequences that were used are
shown below, with the E-Box core sequences underlined:

SEQ ID NO. 6:
(µE2)3 5' GATCTCTGCAGCAGCTGGCAGCAGCTGGCAGCAGCTGGCG 3';

SEQ ID NO. 7:
(µE3)3 5' GATCTGCAGTCATGTGGCGTCATGTGGCGTCATGTGGCAG 3';

SEQ ID NO. 8:
(USE)3 5' GATCTGCAGTCACGTCGCGTCACGTCGCGTCACGTGGCAG 3';

SEQ ID NO. 9:
MLC-A 5' TCGACGTCGCAGCAGGTGCAG 3';

SEQ ID NO. 10:
MLC-B 5' TCGACCCCACCAGCTGGCGAG 3';

SEQ ID NO. 11:
ERP1/2 5' AGCTTCGAACACCTGCAGCAGCTGGCAGGAAGCAGGCCTA 3';

SEQ ID NO. 12:
ERP3/4 5' AGCTTTAAAATCCCCAC£AjG£2Sg6gAAGCAACAGGTGCA 3';

SEQ ID NO. 13:
HSE 5' AATTGCGAAACCCCTGGAATATTCCGACCTGGCAGCCTC 3';

SEQ ID NO. 14:
SMS 5' TCGACTTTAGACCACGTGGTCCCCTCGA 3';

SEQ ID NO. 15:
Randomer 5' GGACGATCTAGATTCG(N)20CCGTCTCGAGTATCGG 3'

Example 2
Purification of c-Myc Protein

A primary goal of this work was to purify and characterize Myc
from a mammalian source. An inducible mammalian overexpression
system that has been described previously was utilized (Wurm et al., Proc.
Natl. Acad. Sci. USA 83:5414-5418 (1986)). Briefly, the two coding exons
of the mouse c-Myc gene under the control of a Drosophila heat shock
promoter had been integrated and amplified in the genome of a Chinese
hamster ovary (CHO) cell line. This overexpressing cell line, 5A, was
adapted to spinner culture. Heat shock (43°C) induces transcription of the
amplified myc genes while a subsequent 2 hour recovery period at normal
growth temperature (37°C) permits translation. The resulting products were
phosphoproteins of 60, 62, and 72 kD which were immunoprecipitable with
Myc-specific monoclonal antibodies (Wurm et al., Proc. Natl. Acad. Sci.
USA 83:5414-5418 (1986)). The c-Myc produced was tightly associated
with the nuclei and attempts to solubilize it using a number of detergents,
salts, and reducing agents were unsuccessful (data not shown). Significant
solubilization was achieved however with either SDS or with urea at
concentrations greater than 4 M. For purification, the Myc was solubilized
with 5 M urea and chromatographed on DEAE resin and FPLC Mono-Q as
described in materials and methods. The presence of Myc in the column
fractions was assayed by immunoblot using an antipeptide monoclonal
antibody, 1F7 (Miyamoto et al., Proc. Natl. Acad. Sci. USA 82:7232-7236
(1985)). This purification procedure yielded 150 µg of c-Myc per liter of
spinner cells (8x10^8 cells). The Myc appeared to be 95% homogeneous as
judged by silver staining (Fig. 1A).
An alternative translation start site for c-Myc accounts for some of
the molecular weight heterogeneity of c-Myc translated in vitro and
expressed in several cell lines (Hann et al., Cell 52:185-195 (1988)). This
alternate site is upstream from the canonical start site, however, and is not
present in our overexpressor gene. N-terminal sequence analysis of each of
the three prominent Myc bands described above revealed, as expected, the
sequence predicted by the canonical start site (data not shown), although
the N terminal methionine was not present, presumably because of N
terminal processing. Therefore the potentially important differences in
apparent molecular weight that are observed might be attributed to
post-translational modifications and not N-terminal heterogeneity.

Human c-Myc has also been purified using the baculovirus
overexpression system. For purification, Sf9 cells that had been infected
with recombinant virus were harvested just prior to the onset of lysis (~48
hours post infection). Myc produced using the baculovirus system has been
previously reported to be both phosphorylated and tightly associated with
the nucleus (Miyamoto et al., Mol. Cell. Biol. 5:2860-2865 (1985)).
Solubilization and purification were carried out as with the CHO produced
Myc resulting in a yield of 2.5 mg/8x10^8 cells. Myc purified from these
insect cells was apparently homogeneous by silver staining, and ran as a
single diffuse band of ~60 kD (Fig. 1B). This was in contrast to the
multiple bands observed with mammalian Myc by immunoblot (Fig. 1B).
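
The two yields quoted in this example are stated per 8x10^8 cells, so they can be compared directly. The Python sketch below converts each to picograms per cell and reports the ratio. It uses only the figures given in the text; the function name is hypothetical.

```python
def yield_pg_per_cell(mass_ug, cells):
    """Purified protein per cell in picograms (1 µg = 1e6 pg)."""
    return mass_ug * 1e6 / cells


cho_yield = yield_pg_per_cell(mass_ug=150.0, cells=8e8)         # CHO 5A cells
baculo_yield = yield_pg_per_cell(mass_ug=2500.0, cells=8e8)     # Sf9 cells, 2.5 mg
fold_difference = baculo_yield / cho_yield

print(f"CHO: {cho_yield:.4f} pg/cell")
print(f"baculovirus: {baculo_yield:.3f} pg/cell")
print(f"baculovirus yield is about {fold_difference:.1f}-fold higher")
```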

Discussion

Myc was purified to near homogeneity from overexpressing
mammalian cells and baculovirus infected cells. The mammalian derived
protein appears to be highly modified in contrast to Myc expressed in and
purified from insect cells. Up to 19 distinct species of c-Myc can be
identified by two dimensional gel electrophoresis (Fig. 1). These species
differ both in size (approximate Mr values of 60,000, 62,000 and 72,000,
although this estimate of size can vary with different gel conditions) and in
pI. These differences in pI might in part be attributed to differences in
phosphorylation, as c-Myc is known to be phosphorylated and the change
in pI of the species is consistent with incremental additions of phosphate.
Although the Myc produced by the baculovirus overexpression system does
not demonstrate the same molecular weight heterogeneity as the mammalian
protein, it too is phosphorylated (Miyamoto et al., Mol. Cell. Biol.
5:2860-2865 (1985)). The specific sites of phosphorylation have not been
determined for either Myc preparation and other as yet unidentified
modifications may distinguish these two Myc preparations.

Example 3

Specific DNA Binding Activity Present in Purified c-Myc

The presence of a B-HLH domain in c-Myc suggested that it would
bind to an E-Box-like sequence of the general pattern CANNTG. These
sites were first identified in immunoglobulin enhancers but have since been
found in many other tissue specific enhancers. It was first determined if
any of these would be bound by the purified c-Myc proteins described in
Example 1. A large number of E box related sequences were screened by
electrophoretic mobility shift assays (EMSA). Those shown in Fig. 2
include synthetic oligonucleotides containing trimers of either the µE2
(CAGCTG) or µE3 (CATGTG) sites of the immunoglobulin enhancer and a
trimer of the Adenovirus major late promoter upstream element (USE)
(CACGTG). Two sites from the myosin light chain (MLC) enhancer are
also shown: the A site (CAGGTG) which resembles the κE2
immunoglobulin enhancer site, and the B site (CAGCTG) which has the
same core sequence as the µE2 site. The heat shock element (HSE) served
as a control since its sequence does not resemble an E-Box core.
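
The CANNTG pattern described above can be checked mechanically against a probe sequence. The Python sketch below scans one strand for every match, including overlapping ones, using the MLC-A, MLC-B and HSE oligonucleotides from the table in Example 1. The function name is hypothetical, and only the strand shown in the table is scanned.

```python
import re

# Lookahead so that overlapping CANNTG matches are all reported.
E_BOX = re.compile(r"(?=(CA[ACGT]{2}TG))")


def find_e_boxes(sequence):
    """Return (0-based position, hexamer) for each CANNTG match on this strand."""
    return [(m.start(), m.group(1)) for m in E_BOX.finditer(sequence.upper())]


probes = {
    "MLC-A": "TCGACGTCGCAGCAGGTGCAG",
    "MLC-B": "TCGACCCCACCAGCTGGCGAG",
    "HSE": "AATTGCGAAACCCCTGGAATATTCCGACCTGGCAGCCTC",
}

for name, seq in probes.items():
    hits = find_e_boxes(seq)
    print(name, hits if hits else "no E-Box core")
```

Run on these three probes, the scan reports CAGGTG in MLC-A, CAGCTG in MLC-B, and no match in the HSE control, in line with the description above.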

Three specific binding activities were detected in this assay forming
complexes referred to as C1 (USE specific), C2 (USE specific), and C2'
(µE2 specific). As demonstrated below, despite the comigration of C2 and
C2', these represent separate complexes based on observed differences in
protein composition as well as binding specificity. The data presented
argue that the C1 complexes are formed by homo-oligomers of Myc while
formation of the C2 and C2' complexes each requires an additional protein.
The slowly migrating complex (C1) formed most readily on the USE (Fig.
2, lanes 5, 11, and 12), less well on the similar µE3 site (Fig. 2, lanes 2
and 8), and not at all on the other E-Box and non-E-Box sites tested. CHO
and baculovirus Myc preparations were similar with regard to the C1
complex, however they differed with regard to the faster migrating
complexes. In the mammalian Myc assays the C2' complex formed on the
µE2 site of the immunoglobulin enhancer and the µE2-like sequence of
the MLC-B site (Fig. 2, lanes 1 and 4). Baculovirus Myc contained no
binding activity with this specificity (Fig. 2, lanes 7 and 10). In contrast,
formation of the C2 complex was detected using either Myc preparation.
The C2 complex formed most readily on the USE site (Fig. 2, lanes 5, 11,
and 12) and less well on the similar µE3 sequence (Fig. 2, lanes 2 and 8).
Very little if any binding was detected on the κE2-like sequences (MLC-A,
Fig. 2, lanes 3 and 9, and µE5, data not shown). No specific binding was
found on non-E-Box sequences such as the HSE (Fig. 2, lanes 6 and 13).

Competition experiments were performed on the three binding
activities C1, C2, and C2' to further characterize their specificity (data not
shown). In experiments using µE2, µE3, USE, µE5, or HSE sequences as
competitors, competition of the C2' complex formed on the µE2 probes
was most easily achieved with the µE2 oligos while the C2 complexes were
preferentially competed by the USE sequence. The C1 complex was also
competed most efficiently by the USE sequence. A detailed analysis of the
binding specificities of these complexes is presented below.

Example 4

Proteins Responsible for Formation of C1, C2, and C2' Complexes

One scenario suggested by the differences in binding is that Myc
might not be the only protein involved in formation of the three complexes.
To distinguish the role of c-Myc from other copurifying proteins in the
formation of the observed complexes, cut and renature experiments were
performed as follows. Preparative amounts of Myc were separated by
SDS-PAGE. Proteins were electroeluted from various molecular weight
slices, precipitated, solubilized in guanidine-hydrochloride and dialyzed to
renature for analysis by EMSA. The C1 complex binding activity may be
renatured from the Myc containing slices of either baculovirus or
mammalian preparations (Fig. 3) while no other slices from the entire gel
contained C1 activity (data not shown). These data argue that Myc alone is
the protein responsible for the C1 complex, and that full length Myc
protein as expressed in eukaryotic cells can bind specifically to sites with
the core sequence CACGTG.

Analysis of the proteins responsible for formation of the C2 and
C2' complexes was achieved with additional cut and renature experiments
performed as described above. EMSA using the USE probe revealed no
single slice from CHO or baculovirus preparations which contained
detectable C2 binding activity (data not shown). However, this activity
was recovered by renaturing proteins from a 26-29 kD slice together with
proteins in the 60-70 kD Myc containing slice (Fig. 4, lanes 1-8). The
26-29 kD component was present in gels loaded with either CHO or
baculovirus produced c-Myc, and, when renatured with Myc, demonstrated
the same specificity as the C2 complex in the loaded material. Renaturation
of the 26-29 kD slice with BSA or protein A did not yield USE binding
activity, suggesting that Myc plays a specific role in the recovery of C2
binding activity.

To examine further the roles of copurifying proteins and of Myc
modifications in the observed binding, Myc was also purified from a
bacterial overexpression system. The expression system and purification
method used were those of Chi Dang and colleagues (see materials and
methods). The bacterially produced protein contains the IgG binding
segment of protein A fused to the C-terminal 85 amino acids of Myc, the
segment of Myc which contains the B-HLH and leucine zipper motifs. For
many of the B-HLH proteins, the small region of the protein containing the
B-HLH motif is not only necessary but fully sufficient for DNA binding if
the correct oligomerization partner is present. This protein was able to
form the C1 complex on the USE probes (Fig. 4, lane 9) and to combine
with the 26-29 kD factor to create the C2 complex (Fig. 4, lane 10).
Competition experiments confirmed the specificity of this reconstituted C2
complex. The C1 and C2 complexes formed using this bacterial fusion
protein migrated more rapidly than those formed using full length c-Myc
(compare Fig. 4, lane 8 with lanes 9 and 10). This may be due to the
difference in size between the full length c-Myc (60-72 kD) and the protein
A-Myc fusion protein (~38 kD) and therefore the mobility of the C2
complex may be interpreted as an indication that Myc is physically present
in the C2 complex, presumably as part of a hetero-oligomer with the 26-29
kD factor.

Analogous experiments were carried out using a similar bacterial
fusion protein containing only the C-terminal 67 amino acids of c-Myc.
This protein contains most of the HLH domain and the entire leucine zipper
domain but no basic region. Although this protein is capable of forming
homo-oligomers in solution (Gentz et al., Science 243:1695-1699 (1989)),
it was unable to bind to DNA to form the C1 complex and was also unable
to combine with the 26-29 kD factor to create any USE binding activity
(Fig. 4, lane 12). These data argue that the role of Myc in the C2
hetero-oligomer requires an intact basic region, the region responsible for
specific DNA contacts in other B-HLH proteins.

Using cut and renature experiments the µE2 binding activity
responsible for the C2' complex was identified. A small amount
of the C2' complex was frequently seen with proteins from the slice
encompassing the 40-50 kD molecular weight range of mammalian Myc
preparations (Fig. 5A). Although no C2' complex was ever seen with the
Myc containing slice alone, renaturation of the protein from the Myc slice
with the 40-50 kD slice reproducibly increased the amount of C2' complex
formed. Both the baculovirus produced Myc and the bacterially expressed
fusion protein containing the basic region, which do not form complexes
themselves on µE2 probes, were also able to increase the amount of
complex formed by the 40-50 kD slice obtained from mammalian
preparations (Fig. 5B and C). Surprisingly the bacterially produced Myc
lacking the basic region could also reconstitute C2' activity, while various
other proteins tried including BSA, immunoglobulins, and protein A could
not. The apparent lack of a role for this basic region suggests that Myc's
involvement in formation of this complex may be other than contacting DNA.

To further determine whether Myc was present in the analyzed
complexes, the Myc preparations were incubated with a Myc-specific
monoclonal antibody prior to EMSA. The probe used in this experiment
(SMS) contained a single site with the USE core sequence, CACGTG. The
Myc-specific antibody eliminated both the C1 and C2 complexes and
produced a prominent complex of slower mobility (Fig. 6). It is not clear
from these data which of the two complexes was supershifted but the
presence of one predominant shifted complex when antibody is present and
two complexes in the absence of antibody argues that the Myc-specific
antibody also completely disrupted one of the original complexes. There
was no effect of a control monoclonal antibody on the formation of either
the C1 or C2 complex. The Myc-specific antibody did not alter the C2'
complex, suggesting that Myc is not present in this complex.

From these experiments it can be concluded that the C1 complex is
formed by Myc alone, that the C2 complex contains Myc and a 26-29 kD
factor and that the C2' complex contains a 40-50 kD factor but does not
contain Myc. It is intriguing that the C2' complex requires the presence of
Myc for formation, but apparently does not contain Myc. Myc therefore
appears capable of affecting the 40-50 kD factor's ability to form the C2'
complex without being a member of the complex. Whatever the
mechanism, the increase in µE2 binding activity of the 40-50 kD factor
appears to be Myc-specific since four different Myc proteins increased the
amount of C2' complex observed while several other proteins did not.

Max protein can be immunoprecipitated from avian and human cells
and low stringency Southern analysis has suggested that a single Max gene
or a small family of genes exist in other vertebrates as well (Blackwood
and Eisenman, Science 251:1211-1217 (1991)). It is possible that hamster
and insect cells have an equivalent of Max. The recovery of a Max-like
activity from insect cells is particularly interesting since no Myc homologs
have been found in insects to date. Drosophila clearly uses B-HLH
heterodimers to regulate aspects of development and the possibility remains
that the natural partner for the 26-29 kD protein in insect cells is an as yet
unidentified B-HLH protein which functions like Myc.

The presence of the 26-29 kD factor in these preparations might
limit their usefulness for certain experiments. By pooling Myc containing
fractions based on an EMSA assay, one may obtain fractions that contain
only the C1 activity and that do not contain the C2 activity, although this
modification reduces the final yield by approximately 80%.

Example 5

Selection of Binding Sites For Myc From Random DNA Sequences

In order to determine the optimal binding sites for the three
complexes in the Myc preparations described above, a modification of a
recently described technique for isolating preferred binding sites from large
pools of randomized DNA sequences was used (Pollock and Treisman,
Nucl. Acids Res. 18:6197-6204 (1990)). Briefly, a pool of double stranded
oligonucleotides was created that consisted of 16 base flanking regions of
defined sequence surrounding a 20 base region of completely random
sequence. Each of the eukaryotic Myc preparations described above was
mixed with this pool of sequences and the protein DNA complexes that
formed were separated by EMSA. The DNA that ran at the position of the
C1 or C2 (and comigrating C2') complexes was isolated, amplified by the
polymerase chain reaction (PCR), and used in a second round of EMSA
selection. Either seven (CHO preparation) or eight (baculovirus
preparation) rounds of selection in total were performed before subcloning
individual members of the selected sequences. As each round was expected
to enrich for better binding sites, the final subcloned oligonucleotides were
expected to contain high affinity binding sites for the C1, C2, and C2'
complexes. In addition, such a procedure should give some indication of
the relative stringency of selection for a given base at a particular position
within the binding site consensus.

The selected sequences were placed in three separate groups for
analysis (Fig. 7). Group I contains sequences that were selected by the C1
complex from either mammalian or baculovirus preparations. These
sequences were pooled for analysis because with both preparations
formation of the C1 complex requires only Myc protein, and because the
two sets of sequences (that isolated with mammalian Myc and that isolated
with baculovirus Myc) were similar to each other. Most of the selected
sequences in this group contained the sequence CACGTG (21/27 of
sequenced subclones). By aligning all of the sequences that contained this
central core sequence, it was found that the sequences flanking this core
were also nonrandom. A 12 base consensus sequence of
GACCACGTGCTC [SEQ ID No. 1] was determined for sites selected by
the C1 complex (see table in Fig. 7 for frequencies at each position; for a
base to be included in the consensus it had to be found in at least 10 out of
the 21 sequences with a CACGTG core). The C2 complex from
baculovirus preparations selected sequences similar to those selected by the
C1 complex (Fig. 7, Group II). Most of these selected sequences also
contained the CACGTG core (19/22). These sequences had similar
flanking sequences adjacent to the core hexamer to those found with the C1
complex, although there was a slight preference for GCC over CTC in the
3' flank (see table for Group II in Fig. 7).
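
The consensus rule stated above (a base enters the consensus only if it occurs in at least 10 of the 21 aligned sequences) amounts to a per-position count with a threshold. The Python sketch below applies such a rule to a small set of aligned sequences. The four input sequences are invented for illustration and are not the sequenced subclones of Fig. 7; the function names and the use of "N" for positions below the threshold are likewise assumptions.

```python
from collections import Counter


def position_counts(aligned):
    """Count bases at each column of equal-length, pre-aligned sequences."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all sequences must be aligned to the same length")
    return [Counter(seq[i] for seq in aligned) for i in range(length)]


def consensus(aligned, min_count):
    """Most frequent base per column, or 'N' if it falls below min_count."""
    bases = []
    for counts in position_counts(aligned):
        base, n = counts.most_common(1)[0]
        bases.append(base if n >= min_count else "N")
    return "".join(bases)


# Hypothetical aligned sites sharing a CACGTG core (not data from Fig. 7).
toy_sites = [
    "GACCACGTGCTC",
    "GACCACGTGGTC",
    "TACCACGTGCTA",
    "GGCCACGTGCCC",
]

print(consensus(toy_sites, min_count=3))  # base must appear in 3 of 4 sequences
print(consensus(toy_sites, min_count=4))  # stricter: must appear in all 4
```

Raising the threshold leaves only the invariant core and its nearest flanks, which gives a rough view of how stringently each position was selected.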
As expected, complexes running at the position of C2 that were
selected by the mammalian Myc preparations had a greater diversity of
sequences (Fig. 7, Group III). Several sequences (8/36) contained the
CACGTG core. These sequences were presumably selected by the
mammalian C2 complex (composed of Myc and the 26-29 kd factor) and
demonstrated the same flank preferences as the C1 complex. Several other
selected sites (9/36) contained a CAGCTG core sequence, presumably
selected by the C2' complex. In addition, 8 of the 36 sequences were very
AT rich, and many of the sequences in all three groups contained AT rich
stretches. This enrichment for AT rich sequences might reflect a
preference of Myc for these sequences, or instead might simply indicate a
bias arising from the protocol used. It is interesting to note, however, that
in previous filter binding experiments, the mammalian Myc preparation has
demonstrated a preference for binding AT rich sequences within various
plasmids or lambda genomic DNA.
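The grouping of selected sites by core sequence described above can be sketched as a simple string search. This is an illustrative sketch only: the function names and the 75% A+T cutoff used to call a site "AT rich" are assumptions, since the text does not give a numerical definition. The three test strings are taken from the sequence listing (SEQ ID NOs 23, 40 and 86) purely as inputs; their assignment to Groups I-III is not implied.

```python
def normalize(seq):
    """Upper-case a sequence and drop the spacing used in the listing."""
    return seq.upper().replace(" ", "")


def at_fraction(seq):
    """Fraction of A and T bases in the sequence."""
    seq = normalize(seq)
    return (seq.count("A") + seq.count("T")) / len(seq)


def classify_site(seq, at_rich_cutoff=0.75):
    """Label a site by E box core, then by AT content (cutoff is an assumption)."""
    seq = normalize(seq)
    # CACGTG and CAGCTG are palindromic, so one strand is sufficient to search.
    if "CACGTG" in seq:
        return "CACGTG core"
    if "CAGCTG" in seq:
        return "CAGCTG core"
    if at_fraction(seq) >= at_rich_cutoff:
        return "AT rich"
    return "other"


examples = {
    "SEQ ID NO:23": "CGACCACGTG CTCTTCGACT TG",
    "SEQ ID NO:40": "GCCAAAAATG TACAGCTGTG CC",
    "SEQ ID NO:86": "GCAAAAAATA AAAATACATC",
}
labels = {name: classify_site(seq) for name, seq in examples.items()}
for name, label in labels.items():
    print(name, label)
```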

To confirm the validity of our site selection procedure, a number of
the selected sites were tested individually by EMSA (Fig. 8). As expected,
it was found that sequences containing the core CACGTG formed both the
C1 and C2 complexes (Fig. 8, probe groups 1, 2, 5, and 6) while
sequences containing the CAGCTG core formed only the C2' complex
(Fig. 8, probe groups 7 and 8). Note that the C2' activity is only present
in the CHO derived Myc preparations. No complex formed when selected
sequences that did not contain a canonical E box core were tested (Fig. 8,
probe groups 3, 4, 9, 10, and 11). These latter sequences, therefore, do
not represent high affinity sites for proteins in the Myc preparations.

Example 6

Off Rate Of The C1 And C2 Complexes

Off-rates for the Myc containing complexes were measured as a
means of comparing their affinities. The off-rate of the C2 complex
formed on the USE probe was approximately 1-2 minutes (Fig. 9,
baculovirus Myc; similar results were obtained with CHO Myc, data not
shown). The C1 complex was not fully competed in this experiment using
a 250-fold excess of USE competitor. Although competition was not
complete, the amount of C1 complex remaining at the earliest measurable
timepoint ("0") was significantly less than the starting amount and virtually
equal to the maximum competition achieved in these experiments. These
data are indicative of an abundant weakly binding protein with an
immeasurably fast off-rate. Therefore Myc alone appears to bind
significantly more weakly than do Myc and the 26-29 kd factor together.
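For readers who want to turn a competition time course of this kind into a number, one common approach is sketched below. It assumes simple first-order dissociation, so that the fraction of complex remaining after competitor addition decays as exp(-k_off * t); the text does not state which model, if any, was applied to the Fig. 9 data. The function name `fit_k_off` and the time course values are hypothetical.

```python
import math


def fit_k_off(times_min, fraction_bound):
    """Estimate k_off (per minute) as minus the least-squares slope of ln(fraction) vs time."""
    ys = [math.log(f) for f in fraction_bound]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, ys))
    return -sxy / sxx


# Hypothetical time course generated from k_off = 0.5 per minute (not measured data).
times = [0.0, 0.5, 1.0, 2.0, 4.0]
fractions = [math.exp(-0.5 * t) for t in times]

k_off = fit_k_off(times, fractions)
half_life = math.log(2) / k_off  # minutes
print(round(k_off, 3), round(half_life, 2))
```

Under this assumption a complex that is already at its competed level at the first measurable timepoint, as reported for C1, simply has too short a half-life to fit.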

Example 7

Identification of an Inhibitor of c-Myc
C2 Complex Activity in Yeast Cells

Yeast host cells are transformed with plasmids carrying a c-Myc
expression vector (host 'a'); or the c-Myc expression vector and a 26-29
kilodalton C2 complex protein identified as above (host 'b'). In addition,
all yeast strains are cotransformed with a plasmid that contains the coding
sequence for β-galactosidase operably-linked to the CACGTG sequence
motif as described above.

A lawn of each of the transformed yeast strains is spread on agar
plates containing X-gal in the medium and small filter disks containing
compound W, X, Y, or Z are placed on the lawns. The yeast are allowed
to grow and the plates are monitored for colony growth and colony color
by visual observation. Typical results from such an experiment are shown
in Table 1.
Table 1: Identification of Inhibitors of C2 Complex Activity

Compound   Yeast   Colony    Color from β-gal
                   Growth    Assay with X-gal

none       a       +         White
           b       +         Blue
W          a       +         White
           b       +         White
X          a       -
           b       -
Y          a       +         White
           b       +         Blue
Z          a       +         Blue
           b       +         Blue

The results of the above table indicate that compound W prevents
the induction of β-galactosidase in the 'b' host cells. Therefore,
compound W is an inhibitor of C2 complex hetero-oligomer formation and
an inhibitor of c-Myc biological activity. Compound X inhibits the
growth of yeast per se and thus would not be a compound of interest.

Compound Y does not prevent induction of β-galactosidase activity
in the 'b' host cells. Therefore, compound Y is not an inhibitor of C2
complex hetero-oligomer formation.

Compound Z shows an interesting effect of inducing β-galactosidase
activity in the 'a' host cells, which do not contain the C2 complex protein
used in the 'b' hosts, rather than preventing hetero-oligomer formation.
This suggests that compound Z may induce synthesis of a partner protein
which is not otherwise present in the yeast host cells or that it may be (or
mimic) such a protein.

From these results, compound W would be identified as an inhibitor
of C2 complex formation and/or DNA binding and thus of c-Myc
transcriptional activity in vivo.
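The reasoning applied to Table 1 can be written out as a small decision procedure. The sketch below is illustrative only: the function name, the result strings, and the data encoding are assumptions, and colony growth is taken as positive wherever the table reports a color and negative for compound X, consistent with the discussion above.

```python
def classify_compound(readout, baseline):
    """Classify a compound from the plate readout of hosts 'a' and 'b'.

    readout and baseline map host -> (grows, color), where color is "Blue",
    "White", or None when no colonies grow.
    """
    if not any(grows for grows, _ in readout.values()):
        return "inhibits yeast growth; not of interest"
    # Reporter normally induced in host 'b' but not in the presence of the compound.
    if baseline["b"][1] == "Blue" and readout["b"][1] == "White":
        return "inhibitor of C2 complex activity"
    # Reporter induced in host 'a', which lacks the C2 complex protein.
    if baseline["a"][1] == "White" and readout["a"][1] == "Blue":
        return "induces reporter without the partner protein"
    return "not an inhibitor"


# Readouts transcribed from Table 1 under the assumptions stated above.
baseline = {"a": (True, "White"), "b": (True, "Blue")}
plates = {
    "W": {"a": (True, "White"), "b": (True, "White")},
    "X": {"a": (False, None), "b": (False, None)},
    "Y": {"a": (True, "White"), "b": (True, "Blue")},
    "Z": {"a": (True, "Blue"), "b": (True, "Blue")},
}
results = {name: classify_compound(r, baseline) for name, r in plates.items()}
for name, call in results.items():
    print(name, "->", call)
```

Checking growth first matters: a compound that kills the yeast would otherwise look like an inhibitor simply because no blue colonies appear.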

Example 8

Identification of an Inhibitor of c-Myc
C2' Complex Activity in Yeast Cells

Yeast host cells are transformed with two plasmids, each plasmid
carrying a C2' complex expression vector encoding at least one 40-50
kilodalton C2' peptide (host 'a'); or the c-Myc expression vector in
addition to the vectors encoding the C2' complex proteins identified as
above (host 'b'). In addition, all yeast strains are cotransformed with a
plasmid that contains the coding sequence for β-galactosidase operably-linked
to the CAGCTG sequence motif as described above.

A lawn of each of the transformed yeast strains is spread on agar
plates containing X-gal in the medium and small filter disks containing
compound W, X, Y, or Z are placed on the lawns. The yeast are allowed
to grow and the plates are monitored for colony growth and colony color
by visual observation. Typical results from such an experiment are shown
in Table 2.
Table 2: Identification of Inhibitors of C2' Complex Activity

Compound   Yeast   Colony    Color from β-gal
                   Growth    Assay with X-gal

none       a       +         White
           b       +         Blue
W          a       +         White
           b       +         White
X          a       -
           b       -
Y          a       +         White
           b       +         Blue
Z          a       +         Blue
           b       +         Blue

The results of the above table indicate that compound W prevents
the induction of β-galactosidase in the 'b' host cells. Therefore, compound
W is an inhibitor of C2' complex hetero-oligomer formation and an
inhibitor of the c-Myc biological activity that is directed towards promoting
such C2' complex hetero-oligomer formation. Compound X inhibits the
growth of yeast per se and thus would not be a compound of interest.

Compound Y does not prevent induction of β-galactosidase activity
in the 'b' host cells. Therefore, compound Y is not an inhibitor of C2'
complex hetero-oligomer formation.

Compound Z shows an interesting effect of inducing β-galactosidase
activity in the 'a' host cells, which do not contain the Myc protein used
in the 'b' hosts, rather than preventing hetero-oligomer formation. This
suggests that compound Z may induce synthesis of a protein that can
substitute for Myc in promoting formation of the C2' complex and which is not
otherwise present in the yeast host cells, or that it may be (or mimic) such a
protein.

From these results, compound W would be identified as an inhibitor
of C2' complex formation and/or DNA binding activity and thus of c-Myc
transcriptional activity in vivo.

All references cited herein are fully incorporated by reference.
Having now fully described the invention, it will be understood by those
with skill in the art that the same may be performed within a wide and
equivalent range of conditions, parameters and the like, without affecting
the spirit or scope of the invention or any embodiment thereof.
SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: Kingston, Robert E
Papoulas, Ophelia

(ii) TITLE OF INVENTION: C-MYC DNA BINDING PARTNERS,
MOTIFS, SCREENING ASSAYS, AND USES THEREOF

(iii) NUMBER OF SEQUENCES: 101

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Sterne, Kessler, Goldstein & Fox
(B) STREET: 1225 Connecticut Avenue, N.W., Suite 300
(C) CITY: Washington
(D) STATE: DC
(E) COUNTRY: USA
(F) ZIP: 20036

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER: US
(B) FILING DATE:
(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Cimbala, Michele A
(B) REGISTRATION NUMBER: 33,851
(C) REFERENCE/DOCKET NUMBER: 0609.3440004

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (202)833-7533
(B) TELEFAX: (202)833-8716



(2) INFORMATION FOR SEQ ID NO:1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
GACCACGTGC TC

(2) INFORMATION FOR SEQ ID NO:2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
GACCACGTGG TC 12

(2) INFORMATION FOR SEQ ID NO:3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
AACACTYCTG TT

(2) INFORMATION FOR SEQ ID NO:4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
COGATATCTC GAGACGG

(2) INFORMATION FOR SEQ ID NO:5:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 16 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
CCACGATCTA GATTCG

(2) INFORMATION FOR SEQ ID NO:6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs
(B) TYPE: nucleic acid



WO 93/08701    PCT/US92/08603



(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
GATCTCTGCA GCAGCTGGCA GCAGCTGGCA CCAGCTGCCG 40

(2) INFORMATION FOR SEQ ID NO:7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
GATCTGCAGT CATGTGGCGT CATGTGCOGT CATGTGGCAG 40

(2) INFORMATION FOR SEQ ID NO:8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
GATCTGCAGT CACGTGGCGT CACGTGGCGT CACGTGGCAG 40

(2) INFORMATION FOR SEQ ID NO:9:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
TOGACGTCGC AGCAGCTGCA G 21

(2) INFORMATION FOR SEQ ID NO:10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
TCGACCCCAC CAGCTGGCGA C 21

(2) INFORMATION FOR SEQ ID NO:11:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:
AGCTTCGAAC ACCTGCAGCA CCTGGCAGGA AGCAGGCCTA 40

(2) INFORMATION FOR SEQ ID NO:12:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
AGCTTTAAAA TCCCCACCAG CTGGCGAAGC AACAGGTGCA 40

(2) INFORMATION FOR SEQ ID NO:13:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 39 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:
AATTGCGAAA CCCCTGGAAT ATTCCGACCT CGCAGCCTC 39

(2) INFORMATION FOR SEQ ID NO:14:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
TCGACTTTAG ACCACGTGGT CCCCTCGA 28

(2) INFORMATION FOR SEQ ID NO:15:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 52 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:
GGACGATCTA GATTCGNNNN NNNNNNNNNN NNNNNNCCGT CTCGAGTATC GG

(2) INFORMATION FOR SEQ ID NO:16:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:
NNCANNTGNN

(2) INFORMATION FOR SEQ ID NO:17:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:
GCAGAATCTA CCACCTGCTC C

(2) INFORMATION FOR SEQ ID NO:18:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:
GGGGCTACCA CGTCCTTATG 20

(2) INFORMATION FOR SEQ ID NO:19:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:
GGACGAAAGC ACGTGCTCCG 20

(2) INFORMATION FOR SEQ ID NO:20:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:
CCACATGACC ACGTGCTCTG 20

(2) INFORMATION FOR SEQ ID NO:21:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:
CGCAGAGACA CGTGCCCTGG 20

(2) INFORMATION FOR SEQ ID NO:22:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:
GGCAAACCAC CTGTTATGTG 20

(2) INFORMATION FOR SEQ ID NO:23:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:
CGACCACGTG CTCTTCGACT TG 22

(2) INFORMATION FOR SEQ ID NO:24:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:
GCACAATTTG TACCACGTGG CCG 23

(2) INFORMATION FOR SEQ ID NO:25:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:
GGACAACATC GACCACGTGG CCG 23

(2) INFORMATION FOR SEQ ID NO:26:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:
CCCTCCATGA CCACGTGGAC C 21

(2) INFORMATION FOR SEQ ID NO:27:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:
GCAAATATGA CCACGTGGTA C 21

(2) INFORMATION FOR SEQ ID NO:28:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:
GGACCACGTG CTCTTTTGTG 20

(2) INFORMATION FOR SEQ ID NO:29:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:
GGCATAAACT CCACGTGGTC C 21

(2) INFORMATION FOR SEQ ID NO:30:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30:
CGGGCACGTG CTCCTCGGAC TG

(2) INFORMATION FOR SEQ ID NO:31:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:
GGTAGCAAAA AGCACGTGCC CG 22

(2) INFORMATION FOR SEQ ID NO:32:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32:
GCGGGATTTA AGCACGTGCT CC 22

(2) INFORMATION FOR SEQ ID NO:33:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33:
CACCTATTAA CCACGTGGTA C 21

(2) INFORMATION FOR SEQ ID NO:34:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:34:
CACCACGCGC CATCCACGTG CCCT 24

(2) INFORMATION FOR SEQ ID NO:35:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:35:
GCGGACCACG TGCTCGGTTG

(2) INFORMATION FOR SEQ ID NO:36:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:36:
CACATATTAG ACCACGTGCT CC 22

(2) INFORMATION FOR SEQ ID NO:37:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:37:
CGGCCACGTG CTCACTGTCT ACC 23

(2) INFORMATION FOR SEQ ID NO:38:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:38:
GGATGGACAG CTTCTTCCTG 20

(2) INFORMATION FOR SEQ ID NO:39:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:39:
CCAATCCCCC GCTGCTOGCC 20

(2) INFORMATION FOR SEQ ID NO:40:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:40:
GCCAAAAATG TACAGCTGTG CC 22

(2) INFORMATION FOR SEQ ID NO:41:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:41:
CGGCCAGGAG CTCATGAATG TGC 23

(2) INFORMATION FOR SEQ ID NO:42:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:42:
GCACGCTGTA CGTGACTTGG 20

(2) INFORMATION FOR SEQ ID NO:43:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:43:
CCGGAGTCCX GGTGCTCTGC 20

(2) INFORMATION FOR SEQ ID NO:44:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:44:
CACTAAGAAA TACCACGTGG CCG 23

(2) INFORMATION FOR SEQ ID NO:45:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:45:
GGGGAIITAA GCACGTGCTC C 21

(2) INFORMATION FOR SEQ ID NO:46:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:46:
CGCCCACGTC CCTTCTTTCT CCG 23

(2) INFORMATION FOR SEQ ID NO:47:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:47:
CATACTCCAC AGAGCAOGTG CGAA 24

(2) INFORMATION FOR SEQ ID NO:48:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:48:
CATAAGTCAG ACCACGTGCC CG 22

(2) INFORMATION FOR SEQ ID NO:49:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:49:
CCCAACTAAG ACCACGTGCC CG 22

(2) INFORMATION FOR SEQ ID NO:50:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:50:
CAGTCGAAGA GGCCACGTGG CGA

(2) INFORMATION FOR SEQ ID NO:51:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:51:
CGTAGGTTAT TCCCACGTGG COG

(2) INFORMATION FOR SEQ ID NO:52:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:52:
CATAAATAGG CCACGTGCTC C

(2) INFORMATION FOR SEQ ID NO:53:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:53:
GGAAAATGTA CCACGTGCTC C

(2) INFORMATION FOR SEQ ID NO:54:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:54:
GGAACAGACC ACGTGGCTTG 20

(2) INFORMATION FOR SEQ ID NO:55:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:55:
GTACCACGTG CTTTTTTGGC 20

(2) INFORMATION FOR SEQ ID NO:56:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:56:
CAGTCCGAGG AGCACGTGCC CG 22

(2) INFORMATION FOR SEQ ID NO:57:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:57:
CGGCCACGTG TCGAGCATGA GTC 23

(2) INFORMATION FOR SEQ ID NO:58:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:58:
CGGCCACGTG CTCGTAAATT TGC 23

(2) INFORMATION FOR SEQ ID NO:59:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:59:
CCCACAAAAT TACCACGTCG CCG

(2) INFORMATION FOR SEQ ID NO:60:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:60:
OGCAAAATCG ACCACGTGGT CC 22

(2) INFORMATION FOR SEQ ID NO:61:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:61:
GCATAAGTAA TACCACGTGO CCG

(2) INFORMATION FOR SEQ ID NO:62:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:62:
GCAAAAAAAC CACCTGCTCC

(2) INFORMATION FOR SEQ ID NO:63:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:63:
GGGGCCGGAA CTCCGTTCTC

(2) INFORMATION FOR SEQ ID NO:64:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:64:
GGGGACCCGA TCTCTCGCTG

(2) INFORMATION FOR SEQ ID NO:65:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:65:
CAATAATATT TCCTTTCCTG

(2) INFORMATION FOR SEQ ID NO:66:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:66:
GTCCACGCGG CATCCACGTG CCGT 24

(2) INFORMATION FOR SEQ ID NO:67:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:67:
CGGCCACGTG CICTATACAT GCC

(2) INFORMATION FOR SEQ ID NO:68:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:68:
GGACCACGTG CTTATCTTTG

(2) INFORMATION FOR SEQ ID NO:69:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:69:
CGACCACGTG TTCCGCTACT CG

(2) INFORMATION FOR SEQ ID NO:70:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:70:
CGAGTAGCGA CCACGTGTTG C 21

(2) INFORMATION FOR SEQ ID NO:71:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:71:
CCACCACGTG CTTACCATGT C 21

(2) INFORMATION FOR SEQ ID NO:72:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:72:
GGACAAAAAG CACGTGCTAC 20

(2) INFORMATION FOR SEQ ID NO:73:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:73:
GCAAAACTCC ACGTGGTCGG 20

(2) INFORMATION FOR SEQ ID NO:74:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:74:
GGGCAAAAAC AACAGCTGTG CC

(2) INFORMATION FOR SEQ ID NO:75:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:75:
GGGAAAGAGA TCAGCTGTGC G

(2) INFORMATION FOR SEQ ID NO:76:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:76:
GGAGAATTCA ACAGCTGACC C

(2) INFORMATION FOR SEQ ID NO:77:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:77:
GGGACAAACC AGTCAGCTGG CCG

(2) INFORMATION FOR SEQ ID NO:78:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:78:
GGGCACAGCT GTTTAGTGGG

(2) INFORMATION FOR SEQ ID NO:79:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:79:
GCCAAGCGCA CAGCTGTTCC

(2) INFORMATION FOR SEQ ID NOs80s 

(1) SEQUENCE CHARACTERISTICS: 
(A) LENGTHS 20 base pairs 
<B) TYPES nucleic acid 

(C) STRANDEDNESSs single 

(D) TOPOLOGY s linear 

(11) MOLECULE TYPEs DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80: 
GGCATTGATC AGCTGTGTGG 
(2) INFORMATION FOR SEQ ID NO:81: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTHS 20 base pairs 

(B) TYPEs nucleic acid 

(C) STRANDEDNESSs single 

(D) TOPOLOGY: linear 

(11) MOLECULE TYPES DNA 



(Xi) SEQUENCE DESCRIPTIONS SEQ ID HO:81: 
GCAAAAACCA GCTGGTCCCC 
(2) INFORMATION FOR SEQ ID NO:82:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:82:

CGCAAGTGTA ACAGCTGGTG C    21
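
As an aid to checking the listing, and not as part of the disclosure, the following Python sketch compares the declared LENGTH of SEQ ID NOs: 76 to 82 with the transcribed sequences and reports where the 5'-CAGCTG-3' motif first occurs in each. The names `LISTING` and `check_entry` are hypothetical and were chosen only for this illustration.

```python
# Illustrative sketch only; LISTING and check_entry are hypothetical names.
# Entries are transcribed from SEQ ID NOs: 76 to 82 above:
# SEQ ID NO -> (declared length in base pairs, sequence as printed).
LISTING = {
    76: (21, "GGAGAATTCA ACAGCTGACC C"),
    77: (23, "GGGACAAACC AGTCAGCTGG CCG"),
    78: (20, "GGGCACAGCT GTTTAGTGGG"),
    79: (20, "GCCAAGCGCA CAGCTGTTCC"),
    80: (20, "GGCATTGATC AGCTGTGTGG"),
    81: (20, "GCAAAAACCA GCTGGTCCCC"),
    82: (21, "CGCAAGTGTA ACAGCTGGTG C"),
}


def check_entry(declared_length, printed_sequence, motif="CAGCTG"):
    """Check one entry: length, alphabet, and first motif position.

    The position is 0-based; -1 means the motif is absent.
    """
    seq = printed_sequence.replace(" ", "").upper()
    return {
        "length_ok": len(seq) == declared_length,
        "only_acgt": set(seq) <= set("ACGT"),
        "motif_at": seq.find(motif),
    }


if __name__ == "__main__":
    for seq_id, (length, printed) in LISTING.items():
        print(seq_id, check_entry(length, printed))
```

A check of this kind only confirms that a transcription is internally consistent; it says nothing about binding behavior.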

(2) INFORMATION FOR SEQ ID NO:83:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:83:

GGATGGTTTT TTTTTTGTAC

(2) INFORMATION FOR SEQ ID NO:84:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:84:

GCATGATTTT CTTTTTGTCC    20

(2) INFORMATION FOR SEQ ID NO:85:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:85:

CAGAGTTTTT TTGAGCCCCC    20

(2) INFORMATION FOR SEQ ID NO:86:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:86:

GCAAAAAATA AAAATACATC    20

(2) INFORMATION FOR SEQ ID NO:87:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:87:

GGCAAAAAAG TCAAAATACG    20

(2) INFORMATION FOR SEQ ID NO:88:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:88:

GCACAATAAA AAACTTTGCG    20

(2) INFORMATION FOR SEQ ID NO:89:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:89:

CCATATGTTC ATTGTTGTCC    20

(2) INFORMATION FOR SEQ ID NO:90:

(i) SEQUENCE CHARACTERISTICS:






(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:90:

CACAAAAATT TAGTGTGTGC    20

(2) INFORMATION FOR SEQ ID NO:91:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:91:

CGGCCCCGTG CTCTAGCCCA TGC    23

(2) INFORMATION FOR SEQ ID NO:92:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:92:

CGGGGAAGTC CCAAGTGCCC C    21

(2) INFORMATION FOR SEQ ID NO:93:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:93:

CACAGGAACA TACACGGGCC CG    22

(2) INFORMATION FOR SEQ ID NO:94:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:94:

GGGACGCGAT GATTGACGTG CCGT

(2) INFORMATION FOR SEQ ID NO:95:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:95:

CGCAAGCGAC CTCAGTCCTG

(2) INFORMATION FOR SEQ ID NO:96:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:96:

CACCTACCAC TGATCGCGGC

(2) INFORMATION FOR SEQ ID NO:97:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:97:

GGACAAACAT CCCATTACCC

(2) INFORMATION FOR SEQ ID NO:98:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:98:

GGGGATGGAA CATCGCGCTG    20

(2) INFORMATION FOR SEQ ID NO:99:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:99:

CCAGTCGGCC CTAACCGGCC    20

(2) INFORMATION FOR SEQ ID NO:100:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:100:

GGGAGCCATC GACGCCGGTG

(2) INFORMATION FOR SEQ ID NO:101:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:101:

CCATAGGGGA GTTGACAGCC

WHAT IS CLAIMED IS:

1. A method for the purification of Myc from a mammalian
source, wherein said method comprises:

(a) growing mammalian cells capable of expressing c-Myc;

(b) inducing c-Myc expression in said cells;

(c) lysing the membrane of said mammalian cells and purifying
nuclei therefrom;

(d) treating said nuclei in a buffer comprising DNase I;

(e) solubilizing said nuclei in a buffer comprising sodium
dodecyl sulfate or urea at concentrations greater than 4 M
and separating the nuclear pellet from the supernatant
fraction;

(f) applying said supernatant fraction of step (e) to a DEAE
Sepharose CL-6B column and eluting bound c-Myc from said
DEAE Sepharose CL-6B column with a salt gradient;

(g) applying said c-Myc of step (f) to a FPLC Mono-Q column
and eluting bound c-Myc with a salt gradient.

2. A method for the detection of C1 complexes in a sample,
wherein said method comprises detecting DNA binding of c-Myc-
containing homo-oligomers to the DNA motif 5'-CACGTG-3', in its double
stranded DNA form.

3. A method for the detection of C2 complexes in a sample,
wherein said method comprises detecting DNA binding of c-Myc-
containing hetero-oligomers to the DNA motif 5'-CACGTG-3', in its
double stranded DNA form.

4. A method for the detection of C2' complexes in a sample,
wherein said method comprises detecting c-Myc directed DNA binding to
the DNA motif 5'-CAGCTG-3', in its double stranded DNA form.

5. A protein composition comprising at least one peptide
capable of forming a C2 complex, wherein said peptide capable of forming
a C2 complex is found in a 26-29 kD protein fraction purified from
Chinese hamster ovary cells or baculovirus.

6. The protein composition of claim 5, wherein said protein
composition is prepared from Chinese hamster ovary cells by a method
comprising the steps of:

(a) growing said cells;

(b) lysing the membrane of said cells and purifying nuclei
therefrom;

(c) treating said nuclei in a buffer comprising DNase I;

(d) solubilizing said nuclei in a buffer comprising sodium
dodecyl sulfate or urea at concentrations greater than 4 M
and separating the nuclear pellet from the supernatant
fraction;

(e) applying said supernatant fraction of step (e) to a DEAE
Sepharose CL-6B column and eluting the bound C2 complex protein
from said DEAE Sepharose CL-6B column with a salt
gradient; and

(g) applying the eluted C2 complex protein of step (f) to a FPLC
Mono-Q column and eluting bound C2 complex protein with
a salt gradient.

7. The protein composition of claim 5, wherein said protein
composition is prepared from baculovirus.

8. A protein composition comprising at least one peptide
capable of forming a C2' complex in the presence of c-Myc, wherein said
peptide capable of forming a C2' complex in the presence of c-Myc is
found in a 40-50 kD protein fraction purified from CHO cells.

9. The protein composition of claim 8, wherein said protein
composition is prepared from Chinese hamster ovary cells by a method
comprising the steps of:

(a) growing said cells;

(b) lysing the membrane of said cells and purifying nuclei
therefrom;

(c) treating said nuclei in a buffer comprising DNase I;

(d) solubilizing said nuclei in a buffer comprising sodium
dodecyl sulfate or urea at concentrations greater than 4 M
and separating the nuclear pellet from the supernatant
fraction;

(e) applying said supernatant fraction of step (e) to a DEAE
Sepharose CL-6B column and eluting the bound C2' complex
protein from said DEAE Sepharose CL-6B column with a
salt gradient; and

(g) applying the eluted C2' complex protein of step (f) to a
FPLC Mono-Q column and eluting bound C2' complex
protein with a salt gradient.

10. A method for objectively classifying compounds, including
human pharmaceuticals, as inhibitors of c-Myc activity, wherein said
method comprises detecting the ability of said compound to inhibit C1
complex formation, C2 complex formation or C2' complex formation.

11. The method of claim 10, wherein said complex formation is
C1 complex formation.

12. The method of claim 10, wherein said complex formation is
C2 complex formation.

13. The method of claim 10, wherein said complex formation is 
C2' complex formation. 

14. A method for objectively classifying compounds, including
human pharmaceuticals, as inhibitors of c-Myc activity, wherein said
method comprises detecting the ability of said compound to inhibit C1
complex DNA binding, C2 complex DNA binding, or C2' complex DNA
binding.



15. The method of claim 14, wherein said DNA binding is C1
complex DNA binding.

16. The method of claim 15, wherein said DNA binding is to an
oligonucleotide comprising the sequence 5'-CACGTG-3'.

17. The method of claim 15, wherein said DNA binding is to an
oligonucleotide comprising the sequence 5'-CATGTG-3'.

18. The method of claim 14, wherein said DNA binding is C2 
complex DNA binding. 

19. The method of claim 18, wherein said DNA binding is to an
oligonucleotide comprising the sequence 5'-CACGTG-3'.



20. The method of claim 18, wherein said DNA binding is to an
oligonucleotide comprising the sequence 5'-CATGTG-3'.

21. The method of claim 14, wherein said DNA binding is C2'
complex DNA binding.

22. The method of claim 21, wherein said DNA binding is to an
oligonucleotide comprising the sequence 5'-CAGCTG-3'.

23. A method for the purification of a peptide capable of forming
a C2 or C2' complex, or a mixture of such peptides from a crude
preparation, wherein said method comprises extraction of Chinese hamster
ovary cells and assay of said peptide by detection of the ability of said
peptide to form said C2 or said C2' complex.

24. A method for identifying and classifying a compound as an
inhibitor of c-myc hetero-oligomer DNA binding wherein said method
comprises evaluating the ability of said compound to alter expression of a
reporter gene in a host cell, wherein expression of said reporter gene is
operably-linked to DNA binding by said hetero-oligomer to an
oligonucleotide comprising the sequence 5'-CACGTG-3'.

25. The method of claim 24, wherein expression of said reporter 
gene induces a phenotypic change in a host cell. 

26. The method of claim 24, wherein said reporter gene is lacZ.

27. The method of claim 24, wherein said reporter gene is CAT.

28. The method of claim 24, wherein said reporter gene is
LEU2.

29. The method of claim 24, wherein said phenotypic change is
detected by visual inspection of the host cell.

30. The method of claim 24, wherein said host is S. cerevisiae.

31. The method of claim 24, wherein said host is a mammalian
cell.

32. A method for identifying and classifying a compound as an
inhibitor of c-Myc-directed C2' hetero-oligomer DNA binding wherein said
method comprises evaluating the ability of said compound to alter
expression of a reporter gene in a host cell, wherein expression of said
reporter gene is operably-linked to DNA binding by said hetero-oligomer to
an oligonucleotide comprising the sequence 5'-CAGCTG-3'.

33. The method of claim 32, wherein expression of said reporter 
induces a phenotypic change in a host cell. 

34. The method of claim 32, wherein said reporter gene is lacZ.

35. The method of claim 32, wherein said reporter gene is CAT.

36. The method of claim 32, wherein said reporter gene is
LEU2.



37. The method of claim 32, wherein said phenotypic change is 
detected by visual inspection of the host cell. 

38. The method of claim 32, wherein said host is S. cerevisiae. 

39. The method of claim 32, wherein said host is a mammalian
cell.

USE CACGTG

MLC-A (kE2) CAGGTG

TITLE OF THE INVENTION

C-MYC DNA BINDING PARTNERS, MOTIFS, SCREENING
ASSAYS, AND USES THEREOF

Cross-Reference to Related Applications

This application is a continuation-in-part of U.S. Application No.
07/510,253, filed April 19, 1990.

Field of the Invention 

This invention is directed to methods for the purification of
mammalian Myc protein, and methods for the identification of compounds
that inhibit c-Myc transcriptional activity.

BACKGROUND OF THE INVENTION



Myc is a nuclear oncogene whose aberrant expression is associated
with many different types of human cancers in many different tissues
(Cole, M.D., Ann. Rev. Genet. 20:361-384 (1986)). While the mechanism
of c-Myc oncoprotein action remains unknown, it clearly plays a role in the
control of cell growth and differentiation (Luscher and Eisenman, Genes &
Dev. 4:2025-2035 (1990); Penn et al., Sem. Cancer Biol. 1:69 (1990)).
One plausible mechanism of Myc action is as a regulator of transcription in
a pathway directly controlling proliferation and differentiation. This model
is consistent with several observations. First, Myc has long been known as
a nuclear protein with a general affinity for DNA (Abrams et al., Cell
29:427-439 (1982); Alitalo et al., Nature 306:274-277 (1983); Donner
et al., Nature 296:262-265 (1982); Persson and Leder, Science 225:718-721
(1984)), and recently a site has been identified which is specifically
bound by bacterially expressed variants of c-Myc (Blackwell et al., Science
250:1149-1151 (1990); Prendergast and Ziff, Science 251:186-189 (1991)).
Second, full length c-Myc has been shown to both activate and repress
genes in transient transfection assays (Kaddurah-Daouk et al., Genes &
Dev. 1:347-357 (1987); Yang et al., Mol. Cell. Biol. 11:2291-2295
(1991)), and will weakly stimulate transcription when fused to a
heterologous DNA-binding domain (Lech et al., Cell 52:179-184 (1988);
Kato et al., Mol. Cell. Biol. 10:5914-5920 (1990)). And finally, sequence
similarities described below place Myc in the company of known
transcription factors.

Myc contains two domains that suggest it oligomerizes, perhaps as a
dimer, and binds specifically to DNA: a leucine zipper domain and a
basic-helix-loop-helix (B-HLH) domain. The leucine zipper is an α-helical
structure found in sequence specific DNA-binding proteins such as Fos and
Jun where it mediates homo- or heterodimerization via a coiled-coil
interaction (Landschulz et al., Science 240:1759-1764 (1988); O'Shea
et al., Science 245:538-542 (1989); and reviewed in Busch and
Sassone-Corsi, TIG 6:36-40 (1990)). This dimerization is necessary for DNA
binding (Gentz et al., Science 243:1695-1699 (1989); Halazonetis et al.,
Cell 55:917-924 (1988); Kouzarides and Ziff, Nature 336:646-651 (1988);
Turner and Tjian, Science 245:1689-1694 (1989)). The HLH region also
appears to mediate oligomerization necessary for DNA binding in several
developmentally important proteins (Murre et al., Cell 58:537-544 (1989);
Murre et al., Cell 56:777-783 (1989)). HLH proteins form a large and
growing family and include the products of the achaete-scute and
daughterless genes responsible for neural development in Drosophila, the R
gene family which regulates pigment pattern in corn, MyoD and several
other proteins involved in muscle specific differentiation in vertebrates, and
a centromere binding protein, CBF1, from yeast (Braun et al., EMBO J.
8:701-709 (1989); Cai and Davis, Cell 61:437-446 (1990); Caudy et al.,
Cell 55:1061-1067 (1988); Cronmiller et al., Genes & Dev. 2:1666-1676
(1988); Davis et al., Cell 52:987-1000 (1987); Edmondson and Olson,
Genes & Dev. 3:628-640 (1989); Ludwig and Wessler, Cell 62:849-851
(1990); Pinney et al., Cell 53:781-793 (1988); Rhodes and Konieczny,
Genes & Dev. 3:2050-2061 (1989); Villares and Cabrera, Cell 50:415-424
(1987); Wright et al., Cell 56:607-617 (1989)). While many proteins
contain either an HLH or leucine zipper motif, Myc is one of a smaller
number of proteins which contain both an HLH and a leucine zipper. Both
the leucine zipper containing proteins and the HLH proteins require a
stretch of basic amino acids adjacent to the dimerization motif to contact
DNA (reviewed in Busch and Sassone-Corsi, TIG 6:36-40 (1990); Jones,
N., Cell 61:9-11 (1990)). Interestingly, all B-HLH proteins appear to bind
to closely related DNA sequences known as E-boxes. These are sequence
motifs found in the immunoglobulin and other tissue specific enhancers
having a core of NNCANNTGNN [SEQ ID No. 16] where different central
bases are preferred by different B-HLH proteins and the flanking bases can
affect binding affinity (Blackwell et al., Science 250:1149-1151 (1990);
Blackwell and Weintraub, Science 250:1104-1110 (1990)). The core of the
reported binding site for c-Myc, CACGTG, fits this pattern and has the
same core sequence as the upstream sequence element (USE) of the
Adenovirus major late promoter (Blackwell et al., Science 250:1149-1151
(1990); Prendergast and Ziff, Science 251:186-189 (1991)). A cellular
transcription factor (USF or MLTF) which binds to the USE has recently
been cloned and also contains a B-HLH domain adjacent to a leucine zipper
(Gregor et al., Genes & Dev. 4:1730-1740 (1990)).
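
For illustration only, the CANNTG core of the consensus described above can be located with a simple pattern search. The Python sketch below is an aid to the reader and forms no part of the disclosed methods: the names `find_core_motifs` and `reverse_complement` and the test sequence are hypothetical, and the flanking N positions of NNCANNTGNN are ignored for simplicity.

```python
import re

# Generic CANNTG core, i.e. the NNCANNTGNN consensus without its flanks.
# A lookahead is used so that overlapping cores would also be reported.
CORE_PATTERN = re.compile(r"(?=(CA[ACGT]{2}TG))")

# Translation table for complementing a DNA strand.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def find_core_motifs(sequence):
    """Return (0-based position, core) for every CANNTG core in the sequence."""
    seq = sequence.replace(" ", "").upper()
    return [(m.start(), m.group(1)) for m in CORE_PATTERN.finditer(seq)]


def reverse_complement(sequence):
    """Return the reverse complement of an upper-case DNA sequence."""
    return sequence.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    # Hypothetical test sequence containing one CACGTG and one CAGCTG core.
    test_sequence = "TTCCACGTGGAACAGCTGTT"
    for position, core in find_core_motifs(test_sequence):
        palindromic = reverse_complement(core) == core
        print(position, core, "palindromic" if palindromic else "asymmetric")
```

Because CACGTG and CAGCTG are each identical to their own reverse complement, a search of one strand also covers the other; this does not hold for an asymmetric core such as CAGGTG.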

Many of these B-HLH or leucine zipper proteins have been found to
form not only homodimers but heterodimers with other proteins having like
dimerization motifs (reviewed in Busch and Sassone-Corsi, TIG 6:36-40
(1990); Jones, N., Cell 61:9-11 (1990)). Heterodimerization between
specific groups of B-HLH or leucine zipper proteins can alter their DNA
binding properties. While homodimers might bind weakly, heterodimers
with the appropriate partner can bind with increased affinity and in some
cases with a new specificity (Jones, N., Cell 61:9-11 (1990); Blackwell and
Weintraub, Science 250:1104-1110 (1990); Wright et al., Mol. Cell. Biol.
11:4104-4110 (1991)). Myc is capable of forming a homo-oligomer at high
concentrations in vitro (Dang et al., Nature 337:664-666 (1989); Kerkhoff
and Bister, Oncogene 6:93-102 (1991)), although it is not clear whether
that homo-oligomer actually forms in vivo (Dang et al., Mol. Cell. Biol.
11:954-962 (1991)). It seems likely that Myc directly interacts with other
cellular protein(s) to form hetero-oligomer(s), and indeed one such
"partner" protein, designated Max, has recently been identified (Blackwood
and Eisenman, Science 251:1211-1217 (1991)). The effect that such
partner proteins have on Myc DNA-binding specificity is likely to be
central to understanding the function of Myc.

Much of the in vitro work done on B-HLH proteins has utilized in
vitro transcribed and translated protein or has used protein overexpressed in
bacteria. Myc expressed by these means has been used to determine
binding specificity and to demonstrate that Myc can form heterodimers with
Max (Blackwell et al., Science 250:1149-1151 (1990); Prendergast and
Ziff, Science 251:186-189 (1991); Blackwood and Eisenman, Science
251:1211-1217 (1991)). Myc, however, is post-translationally modified by
at least phosphorylation in mammalian cells (Hann and Eisenman, Mol.
Cell. Biol. 4:2486-2497 (1984); Ramsay et al., Proc. Natl. Acad. Sci. USA
81:7742-7746 (1984)), and post-translational modifications are believed to
regulate the function of many proteins, including the transcription factors
Myb, Fos, HSF, CREB, and SP-1 (Abate et al., Science 249:1157-1161
(1990); Jackson et al., Cell 63:155-165 (1990); Luscher et al., Nature
344:517-522 (1990); Sorger et al., Nature 329:81-84 (1987); Yamamoto
et al., Nature 334:494-498 (1988)). In addition, Myc produced in avian
cells has been reported to bind more tightly to DNA cellulose than
bacterially produced Myc (Kerkhoff and Bister, Oncogene 6:93-102 (1991)).

Several lines of evidence argue that the biochemical function(s) of
Myc will be determined in large part by hetero-oligomerization with Max
and perhaps with other, as yet unidentified, factors. A complete
understanding of the function of c-Myc will therefore require the
identification of all partner proteins and a functional characterization of the
complexes that these proteins form in the absence or presence of c-Myc.
To unravel the nature of Myc's function it will be necessary to determine
not only the binding properties of all relevant complexes but to ascertain
how they differ in action once bound. Post-translational modification might
play a role in modulating the formation, binding, or further activities of
these complexes and the availability of large quantities of modified c-Myc,
such as described here, should facilitate a biochemical approach to this
problem. Such studies should lead us to an understanding of the complexes
available at different times in different cell types and the consequences for
each cell in terms of appropriate growth and differentiation, or
oncogenesis.

Further, to date, no inhibitors of c-Myc action have been identified.
The identification of such inhibitors has suffered for lack of identification
of a specific DNA binding sequence to which c-Myc binds, and for lack of
a simple, inexpensive and reliable screening assay which could rapidly
identify potential inhibitors and active derivatives thereof. Thus a need also
still exists for rapid, economical screening assays which identify specific
inhibitors of c-Myc activity.

SUMMARY OF THE INVENTION

Recognizing the potential importance of inhibitors of c-Myc
oncoprotein activity in the therapeutic treatment of many forms of cancer,
and cognizant of the lack of a simple assay system in which such inhibitors
might be identified, the inventors have investigated c-myc DNA binding.

These efforts led to the development of a mammalian cell line that
overexpresses Myc and the purification of significant quantities of c-Myc
from these cells. These efforts culminated in the discovery of three types
of c-Myc-driven protein oligomerization (or complex) formations: (1)
homo-oligomer complexes (herein termed C1 complexes) formed by
association of at least two peptides of c-Myc, (2) hetero-oligomer
complexes (herein termed C2 complexes) formed by heterodimerization of
at least two peptides, at least one of which is not the c-Myc peptide, and
specifically hetero-oligomerization between c-Myc and a 26-29 kD factor,
and (3) c-Myc-dependent hetero-oligomeric complexes (herein termed the
C2' complex) formed in the presence of c-Myc, however such
hetero-oligomeric proteins not containing any peptides which are c-Myc.

Accordingly, the invention is directed to a reliable and accurate 
method for the purification of Myc from a mammalian source. 

The invention is further directed to the use of oligomers containing
the DNA motif 5'-CACGTG-3', in its double stranded DNA form, as a
reliable and accurate method for the detection of the presence of C1
complexes in a sample.

The invention is further directed to the use of the DNA motif
5'-CACGTG-3', in its double stranded DNA form, as a reliable and accurate
method for the detection of C2 complexes in a sample.

The invention is further directed to the use of the DNA motif
5'-CAGCTG-3', in its double stranded DNA form, as a reliable and accurate
method for the detection of C2' complexes in a sample.

The invention is further directed to a 26-29 kD protein fraction
purified from Chinese hamster ovary (CHO) cells or baculovirus, such
protein fraction containing at least one peptide capable of forming C2
complex oligomers with c-Myc.

The invention is further directed to a 40-50 kD protein fraction
purified from CHO cells, such protein fraction containing at least one
peptide capable of forming C2' complex oligomers in the presence of
c-Myc.

The invention is further directed to a reliable and accurate method
for objectively classifying compounds, including human pharmaceuticals, as
inhibitors of c-Myc activity, and especially as an inhibitor of C1 complex
formation, C2 complex formation or C2' complex formation.

The invention is further directed to a reliable and accurate method
for objectively classifying compounds, including human pharmaceuticals, as
inhibitors of c-Myc activity, and especially as an inhibitor of C1 complex
DNA binding, C2 complex DNA binding, or C2' complex DNA binding.

The invention further provides a method for identifying and
classifying the mechanism of action of a bioactive c-Myc-inhibiting
compound.

The invention further provides an assay for the monitoring of the
isolation and/or purification of a peptide capable of forming a C2 or C2'
complex, or a mixture of such peptides from a crude preparation.

The invention further provides an assay for the monitoring of the
isolation and/or purification of a c-Myc-inhibiting compound or mixture of
such compounds from a crude preparation of such compounds.

BRIEF DESCRIPTION OF THE DRAWINGS 

Fig. 1. Purified c-Myc Protein. A) 1 μg of c-Myc protein purified
from the 5A overexpressing CHO cell line was subjected to 2-dimensional
gel electrophoresis. An isoelectric focusing tube gel was run with pH 5-7
ampholytes followed by SDS-PAGE and silver staining. The Myc proteins
are bracketed and arrows distinguish the 60, 62, and 72 kD species. The
gel was trimmed for this figure; the actual pI range for the Myc proteins
was 5.0-5.6. B) 0.5 μg of purified c-Myc protein from the indicated cell
lines was electrophoresed on an SDS gel and either visualized by silver
staining (left lane) or electroblotted to nitrocellulose and subjected to
immunoblotting using the ST-2 polyclonal antibody (right 2 lanes).

Fig. 2. DNA Binding of Purified c-Myc Proteins. The EMSA was
carried out as described in materials and methods using equal amounts
(approximately 2 ng) of the following probes and 0.5 μg of either purified
CHO produced c-Myc or baculovirus produced c-Myc: (μE2)3 lanes 1 and
7, (μE3)3 lanes 2 and 8, MLC-A lanes 3 and 9, MLC-B lanes 4 and 10,
(USE)3 lanes 5, 11, and 12, and HSE lanes 6 and 13. Full probe
sequences are given in materials and methods. Lanes 1-6 and lanes 7-13
are different exposures of lanes from the same gel.

Fig. 3. C1 Binding Activity is Present in Myc-containing Slices of
SDS Gels. 400 μg of CHO produced c-Myc or 163 μg of baculovirus
produced c-Myc was separated on an SDS-PAGE gel. Proteins from 0.5
cm slices were recovered, renatured as described in materials and methods,
and analyzed by EMSA using the (USE)3 probe. 0.4 μg of the CHO Myc
load and 5 μl of the protein from the CHO Myc-containing slice were
analyzed on the same gel (left panel). 0.37 μg of the baculovirus Myc load
and 5 μl of the protein from the baculovirus Myc slice were analyzed on
the same gel (right panel). Slices from other molecular weight ranges of
the same gel showed no binding (data not shown).

Fig. 4. Activity is Formed by c-Myc and a 26-29 kD Factor.
Proteins from gel slices were recovered and analyzed by EMSA as
described in materials and methods using the (USE)3 probe. Lanes 1-4
represent proteins from the same gel loaded with baculovirus produced
Myc described for Fig. 5. These lanes contain 0.37 μg of the loaded
material (lane 1), 0.75 μg BSA with 7.5 μl of proteins from either a Myc
slice (lane 2) or a 26-29 kD slice (lane 3), or 7.5 μl of each slice used for
lanes 1 and 2 plus 0.2 μg of BSA (lane 4). Lanes 5-8 and 10 contain
proteins from gels loaded with Myc purified from CHO cells. These lanes
contain 0.47 μg of the gel load (lane 5), 4 μl of material from a Myc slice
of a gel loaded with 400 μg of Myc (lane 6), 7 μl of material from a
26-29 kD slice of a similar gel plus 0.8 μg Protein A (lane 7), and both
4 μl of the Myc slice and 7 μl of the 26-29 kD slice (lane 8). Lanes 9-12
utilize the bacterially expressed Protein A-Myc fusion proteins containing
either the Myc B-HLH and leucine zipper domains (amino acids 353-439) or
lacking the basic region and containing Myc amino acids 372-439. These
were expressed and purified as described in materials and methods. Lane 9
contains 0.5 μg of Protein A-Myc(353-439) and lane 10 contains the same
plus 7 μl of the 26-29 kD slice. Lane 11 contains 1 μg of Protein
A-Myc(372-439) and lane 12 contains 0.5 μg of Protein A-Myc(372-439) plus
7 μl of the 26-29 kD slice.

Fig. 5. C2' Binding Activity Requires a 40-50 kD Factor. A) 101
μg of CHO produced c-Myc was separated on an SDS gel. Proteins were
recovered, resuspended in 100 μl, and renatured and analyzed by EMSA
using the ERP3/4 probe. This probe contains the portion of the MLC
enhancer that encompasses the μE2 site. EMSA samples contained 0.3 μg
of the SDS gel load (lane 1), 7.5 μl of the proteins from the Myc slice
(lane 2), or the 40-50 kD slice (lane 3), or 7.5 μl of both slices renatured
together (lane 4). B) EMSA samples contained 0.9 μg purified baculovirus
produced c-Myc (lane 5), 3 μl of protein from the 40-50 kD slice of a gel
loaded with 400 μg CHO produced c-Myc (lane 6), or both renatured
together (lane 7). The probe was ERP1/2. C) EMSA samples contained
10 μl (0.9 μg) of bacterially produced c-Myc fusion protein containing Myc
amino acids 353-439 (lane 8), 0.47 μg of CHO produced c-Myc (lane 9), 5
μl of protein from the 40-50 kD slice of a gel loaded with 400 μg of the
CHO Myc shown in lane 9 (lane 10), or 5 μl of the same 40-50 kD
material renatured in the presence of either 0.9 μg of the baculovirus
produced Myc shown in lane 5 (lane 11), 2 μl (0.18 μg) of the bacterially
produced Myc fusion protein containing Myc amino acids 353-439 (lane
12), or 4 μl (0.36 μg) of the same bacterially produced Myc fusion protein
(lane 13). The probe was ERP1/2.

Fig. 6. Antibodies to c-Myc Interact with the C1 and C2
Complexes. EMSA reactions were set up with the indicated Myc protein
preparations (0.37 µg baculovirus produced c-Myc or 0.47 µg of CHO
produced c-Myc). These reactions were preincubated 30 min on ice in the
presence of the indicated antibody (α-Myc monoclonal 1F7 or a
monoclonal directed against the lambda repressor, cI). 1 ng of SMS probe
or µE2-containing probe number 7 (see Fig. 7) was added subsequently and
binding and electrophoresis were as usual.

Fig. 7. Oligonucleotides Selected from Random Sequence after «
Rounds of EMSA. Sequences were selected from oligonucleotides
containing 20 base pairs of random sequence using a reiterative EMSA
procedure described in materials and methods. Underlined nucleotides are
from the PCR primer sites. Tables below the aligned sequences tabulate
the frequency of each base in the 6 flanking positions surrounding the
CACGTG motifs. Only bases next to a perfect fit of the CACGTG core
were tabulated since sequences without this core were found not to function
as high affinity binding sites (Fig. 8, and data not shown). Bold numbers
adjacent to individual sequences indicate those oligonucleotides which were
tested individually by EMSA in Fig. 8. Asterisks indicate additional
sequences which were tested individually (data not shown).

Fig. 8. Selected Sites Form Predicted Complex. EMSA was
carried out using either 2.8 ng of the SMS probe or equal amounts (1 ng)
of probes 1-11 indicated in Fig. 7. Probes 1-11 were labeled and gel
isolated in parallel and had approximately equal specific activities. Binding
reactions contained either no additional protein (-), 0.37 µg of baculovirus
produced c-Myc (B) or 0.47 µg of CHO produced c-Myc (C). Free probe
is visible at the bottom of the gel.

Fig. 9. Off-Rate of the C1 and C2 Complexes. The standard
EMSA reaction was scaled up for 11 samples containing 0.4 µg of purified
baculovirus produced c-Myc per sample. Probe and competitor were
(USE) 3 . After allowing 20 min for binding, 20 µl was loaded on a prerun
EMSA gel as a measure of the starting amount of complex (ST) and
enough cold competitor was added to the remaining sample to achieve a
250 fold molar excess over probe. Immediately upon addition of
competitor the sample was gently mixed and 20 µl aliquots were loaded at
the indicated times (0, 30 s, 1 min, 4 min, etc.). A control sample (C)
was made up individually in which competitor was added prior to the start
of binding to demonstrate complete competition. This sample was loaded
at the same time as the ST sample. All samples were loaded on a
continuously running gel so that the complex in the starting lane runs ahead
of the equivalent complex in lanes loaded later.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

In the description that follows, a number of terms used in
recombinant DNA technology are extensively utilized. In order to provide
a clearer and consistent understanding of the specification and claims,
including the scope to be given such terms, the following definitions are
provided.

Oligomer of Interest. As used herein, an "oligomer of interest"
refers to any of the following types of oligomeric proteins: first,
Myc-containing oligomers including homo-oligomers of Myc peptides (a C1
complex), and hetero-oligomers containing at least one peptide of Myc and
one peptide of a Myc "partner" (a C2 complex); second, oligomers that
form in the presence of Myc-containing homo-oligomers or Myc-containing
hetero-oligomers but which themselves do not contain the Myc peptide,
such oligomers including non-Myc-containing homo-oligomers that
associate in the presence of Myc and non-Myc-containing hetero-oligomers
that associate in the presence of Myc (a C2' complex).

Oligomer. An "oligomer," as it refers to proteins, means a protein
composed of more than one peptide subunit, such as dimers, trimers,
tetramers, etc. Such oligomeric protein may be a homo-oligomer, that is,
composed entirely of two or more identical subunits; alternatively, such
oligomeric protein may be a hetero-oligomer, that is, composed of at least
two different peptides. Oligomers containing three or more peptides may
contain more than one copy of a peptide.

C2' Protein(s). As used herein, for convenience, a "C2' protein" is
a protein or peptide that is a member of the second class of the
"oligomers-of-interest," that is, a protein that forms oligomers in the
presence of Myc, c-Myc homo-oligomers or Myc-containing hetero-oligomers
so as to bind to a specific DNA sequence, but which does not contain a Myc
peptide, such oligomers including non-Myc-containing homo-oligomers that
associate in the presence of Myc and non-Myc-containing hetero-oligomers
that associate in the presence of Myc.

Operably-linked. As used herein, two macromolecular elements are
operably-linked when the two macromolecular elements are physically
arranged such that factors which influence the activity of the first element
cause the first element to induce an effect on the second element. For
example, the transcription of a coding sequence which is operably-linked to
a promoter element is induced by factors which "activate" the promoter's
activity; transcription of a coding sequence which is operably-linked to a
promoter element is inhibited by factors which "repress" the promoter's
activity. Thus, a promoter region would be operably-linked to the coding
sequence of a protein if transcription of the coding sequence was
influenced by the activity of the promoter.

Response. As used herein, the term "response" is intended to refer
to a change in any parameter which can be used to measure, indicate or
otherwise describe c-Myc action or oligomer (homo-oligomer (C1 complex)
or hetero-oligomer (C2 complex)) formation, including c-Myc dependent
hetero-oligomer (C2' complex) formation. The response may be
revealed as a physical change (such as a change in phenotype) or it may be
revealed as a molecular change (such as a change in a reaction rate or
affinity constant). Detection of the response may be performed by any
appropriate means. "Detecting" refers to any method by which such
response may be evaluated so as to provide a meaningful indication of
whether the event has occurred.

Compound. The term "compound" is intended to refer to a
chemical entity, whether in the solid, liquid, or gaseous phase. The term
should be read to include synthetic compounds, natural products and
macromolecular entities such as polypeptides, polynucleotides, or lipids,
and also small entities such as neurotransmitters, ligands, hormones or
elemental compounds.

Bioactive Compound. The term "bioactive compound" is intended
to refer to any compound which induces a detectable or measurable
response in the methods of the invention.

Promoter. A "promoter" is a DNA sequence located proximal
to the start of transcription at the 5' end of the transcribed sequence. The
promoter may contain multiple regulatory elements which interact in
modulating transcription of the operably-linked gene.

Expression. Expression is the process by which the information
encoded within a gene is revealed. If the gene encodes a protein,
expression involves transcription of the DNA into mRNA, the processing
of mRNA (if necessary) into a mature mRNA product, and translation of
the mature mRNA into protein.

A nucleic acid molecule, such as a DNA or gene, is said to be
"capable of expressing" a polypeptide if the DNA contains the coding
sequences for the polypeptide and expression control sequences which, in
the appropriate host environment, provide the ability to transcribe, process
and translate the genetic information contained in the DNA into a protein
product, and if such expression control sequences are operably-linked to the
nucleotide sequence which encodes the polypeptide.

Cloning vehicle. A "cloning vehicle" is any molecular entity that is
capable of delivering a nucleic acid sequence into a host cell for cloning
purposes. Examples of cloning vehicles include plasmids or phage
genomes. A plasmid that can replicate autonomously in the host cell is
especially desired. Alternatively, a nucleic acid molecule that can insert
into the host cell's chromosomal DNA is especially useful.

Cloning vehicles are often characterized by one or a small number
of endonuclease recognition sites at which such DNA sequences may be cut
in a determinable fashion without loss of an essential biological function of
the vehicle, and into which DNA may be spliced in order to bring about its
replication and cloning.

The cloning vehicle may further contain a marker suitable for use in
the identification of cells transformed with the cloning vehicle. Markers,
for example, are tetracycline resistance or ampicillin resistance. The word
"vector" is sometimes used for "cloning vehicle."

Expression vehicle. An "expression vehicle" is a vehicle or vector
similar to a cloning vehicle but is especially designed to provide sequences
capable of expressing the cloned gene after transformation into a host.

In an expression vehicle, the gene to be cloned is usually
operably-linked to certain control sequences such as promoter sequences.
Expression control sequences will vary depending on whether the vector is
designed to express the operably-linked gene in a prokaryotic or eukaryotic
host and may additionally contain transcriptional elements such as enhancer
elements, termination sequences, tissue-specificity elements, and/or
translational initiation and termination sites.

Host. By "host" is meant any organism that is the recipient of a 
cloning or expression vehicle. 



a. Isolation of c-Myc Protein From Mammalian Cells and
Preparation of Fractions Containing C2 and C2' Complex
Binding Activity



Although there have been previous reports of purified Myc protein,
the present inventors found that the Myc protein preparations described
therein, and the methods used to isolate that protein, failed to achieve the
yield needed to characterize Myc action in mammalian sources. The
inventors have overcome this problem and describe, for the first time, a
unique and useful method for the isolation of highly purified mammalian
c-Myc protein which provides the requisite high quantity of mammalian
c-Myc protein needed for the characterization of c-Myc directed DNA
binding and biological action. The inventors have also been able to purify
large quantities of Myc from a recombinant insect cell system. The purified
Myc protein of the invention exhibits the only known biochemical activity
of c-Myc, an ability to bind the sequence CACGTG. As a direct result of
the method of the invention for the isolation of c-Myc protein, the inventors
were able to identify peptides that naturally associate with c-Myc in
hetero-oligomers, or peptides that naturally associate with each other as a
result of the action of c-Myc, such peptides being present in certain column
chromatography fractions of the c-Myc purification scheme.

According to the invention, purification of Myc from a
mammalian source is preferably achieved utilizing a mammalian cell line
that overexpresses either recombinant or non-recombinant c-Myc and is
performed completely on ice or at equivalent temperatures of 0-5°C, using
reagents and buffers at the same temperature. For example, the
overexpressing Chinese hamster ovary (CHO) cell line 5A is useful for
such purification. In CHO 5A cells, recombinant mouse c-Myc is under
the control of a regulatable promoter, and has been integrated and
amplified in the genome of the parent CHO cell line for maximum stability
and production. Except where otherwise noted, for the methods and assays
of the invention, the native or recombinant Myc should include at least the
two coding exons of Myc.

After collecting the cells by centrifugation using techniques known
in the art, and prior to lysis of the outer cell membrane, the cells should be
washed at least once in a low salt neutral buffer such as 0.9% NaCl in
10-50 mM phosphate, pH 7.0-7.5 (phosphate buffered saline, PBS) to remove
remaining growth medium.

Lysis of the washed cells is also achieved in a low salt, neutral to
mildly acidic lysis buffer, preferably about pH 6.8, containing at least one
protease inhibitor, such as aprotinin or phenylmethylsulfonyl fluoride
(PMSF), preferably containing a combination of such inhibitors. Salts such




as potassium (in the KCl form) and magnesium (in the MgCl2 form) are
also preferably added. In addition, nonionic detergents such as NP40 (0.5%
v/v) and Na-deoxycholate (0.1%) should be added.

Cell outer membrane lysis should be performed under conditions
that lyse the host cell without lysing the nucleus or inducing significant
leakage from the nuclear membrane. The cells may be allowed to sit for a
short period of time, for example, 10 minutes, in the detergent-containing
lysis buffer before mechanical intervention is utilized in the lysis step.
Mechanical intervention is best performed with a gentle disruption of the
detergent treated cells, for example, utilizing 40 strokes in a Dounce
homogenizer with a type A pestle, or the equivalent of such treatment.

Nuclei may be collected from the lysed cell preparation using
techniques known in the art, such as, for example, centrifugation at 1000 x g
for 5 min at 4°C, and washed at least once in the same low salt lysis buffer
used to lyse the outer cell membrane.

Nuclei are then resuspended in the low salt lysis buffer that
additionally contains sufficient DNAse I and incubated for a time sufficient
to efficaciously degrade the DNA in such nuclei to a size and viscosity that
allows subsequent purification of the c-Myc from this preparation as
described below.

Following the DNAse I treatment, the sample is diluted with a high
salt neutral buffer that brings the salt (as NaCl) concentration of the sample
to at least 2 M. Such high salt buffer preferably also contains amounts of
MgCl2 sufficient to maintain the same concentration of this salt in
the final diluted preparation, and also additional detergent NP40 so as to
retain efficacious levels after sample dilution.
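
The dilution step above is a simple mixing calculation. The following is a minimal sketch, assuming a hypothetical 4 M NaCl high salt stock and a 10 ml sample with negligible NaCl (neither value is given in the specification); the function names are illustrative only.

```python
def high_salt_volume(sample_ml, sample_molar, stock_molar, target_molar):
    """Volume (ml) of high salt buffer that brings the mixture to target_molar.

    Solves sample_ml*sample_molar + v*stock_molar = target_molar*(sample_ml + v).
    """
    if not sample_molar < target_molar < stock_molar:
        raise ValueError("target must lie between sample and stock concentrations")
    return sample_ml * (target_molar - sample_molar) / (stock_molar - target_molar)


def final_concentration(sample_ml, sample_molar, added_ml, added_molar):
    """Concentration of a solute after mixing two solutions."""
    total_moles = sample_ml * sample_molar + added_ml * added_molar
    return total_moles / (sample_ml + added_ml)


# Hypothetical values: 10 ml of sample, 4 M NaCl stock, 2 M target.
added = high_salt_volume(10.0, 0.0, 4.0, 2.0)
nacl = final_concentration(10.0, 0.0, added, 4.0)
# A component present at the same concentration in both solutions (as
# described for MgCl2) is unchanged by the dilution.
mgcl2 = final_concentration(10.0, 0.005, added, 0.005)
```

A component that is absent from the high salt buffer would instead be diluted by the factor sample_ml / (sample_ml + added_ml), which is why the text calls for MgCl2 and NP40 to be included in that buffer.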

In mammalian host cells, c-Myc is generally tightly associated with
the nuclei. Accordingly, it is necessary to solubilize c-Myc in a manner
that does not destroy its biological activity or its ability to renature into a
biologically active form. The residual nuclear material is first removed by
centrifugation and then the pellet resuspended for solubilization of the
c-Myc. Solubilization of the c-Myc protein in a manner that destroys this
association may be achieved with either sodium dodecyl sulfate (SDS) or
urea at concentrations greater than 4 M. Preferably, 5 M urea is utilized.
Residual non-lysed nuclei may also be solubilized at this time by vigorous
stirring for about 30 min. The solution is then centrifuged to pellet any
remaining insoluble material prior to the subsequent chromatography steps,
for example, at 5000 x g for about 10 min.

The supernatant fraction recovered from the centrifugation step is
applied to a DEAE Sepharose CL-6B column equilibrated in the
urea-containing buffer as described above, and the column is thoroughly
washed with such buffer to remove unbound protein. A second wash is
performed with the addition of an intermediate amount of NaCl, 0.1 M
NaCl, to the buffer. Finally, Myc protein is eluted by raising the salt
concentration in the buffer to 0.35 M.

All proteins eluting with the 0.35 M salt wash are collected and
applied to an FPLC Mono-Q column. The column is washed and eluted with a
gradient of 0.10 M NaCl to 0.35 M NaCl, followed by a 2 M NaCl step
wash. Holding the gradient at intermediate salt concentrations, for example
at about 0.19 M NaCl, until the end tail of the contaminating protein has
finished eluting will enhance the purity of the subsequently eluted Myc
protein.

Myc may be identified in the column eluent by any technique that
specifically recognizes Myc protein or its activity. For example, a
monoclonal antibody such as 1F7 may be used in an immunoassay for the
presence of Myc protein. Alternatively, DNA binding activity to an
oligonucleotide containing the sequence 5'-CACGTG-3' may be used to
monitor the purification. Monoclonal antibody 1F7 is directed against the
peptide sequence of amino acids 305-317 in murine c-Myc. Other Myc
monoclonal antibodies useful in such assays are commercially available.
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Pools of fractions from this column contained the C2 and C2 ' 
binding activities described below, and the presence of peptides capable of
entering into C2 and C2' hetero-oligomers may be assayed by the ability of
such hetero-oligomers to bind to the DNA sequences 5'-CACGTG-3' and
5'-CAGCTG-3', respectively. Myc purified from the CHO cells appeared as
multiple bands by immunoblot.

b. Purification of c-Myc and Its Partners From a
Baculovirus Source

Human c-Myc may also be purified using the baculovirus
overexpression system. For purification, Sf9 cells that had been infected
with recombinant baculovirus carrying the c-Myc gene, using techniques
known in the art, were harvested just prior to the onset of lysis (~48 hours
post infection). Solubilization and purification of the recombinant c-Myc
were carried out as with the CHO produced Myc, resulting in a yield of 2.5
mg per 8 x 10^8 cells. Myc purified from these insect cells was apparently
homogeneous by silver staining, and ran on electrophoresis as a single
diffuse band of ~60 kD. This was in contrast to the multiple bands
observed with mammalian Myc by immunoblot (Fig. 1B).
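
To put the reported yield in per-cell terms, the following minimal sketch converts 2.5 mg per 8 x 10^8 cells into an approximate number of recovered molecules per cell, assuming the apparent gel mobility of ~60 kD is used as the molecular mass. That assumption is illustrative, and the result reflects recovered protein only, not the total expressed.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def molecules_per_cell(yield_mg, n_cells, mass_kd):
    """Recovered protein molecules per cell from a bulk purification yield."""
    grams_per_cell = yield_mg * 1e-3 / n_cells
    moles_per_cell = grams_per_cell / (mass_kd * 1e3)  # 1 kD = 1000 g/mol
    return moles_per_cell * AVOGADRO


# Values from the text: 2.5 mg from 8 x 10^8 cells, band at ~60 kD.
grams_each = 2.5e-3 / 8e8                      # about 3.1 pg per cell
recovered = molecules_per_cell(2.5, 8e8, 60.0)  # on the order of 10^7
```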

c. Detection of Sequence Specific DNA Binding Activity

The above preparations contain two sequence specific DNA-binding
activities that both contain Myc protein. The first activity contains only
Myc (i.e., forms the Myc homo-oligomer) and binds very weakly to
sequences with the core CACGTG. The binding is assayed by determining
the off rate and by competitor assays, both techniques known in the art.
The binding of c-Myc homo-oligomers is characterized by an immeasurably
fast off rate and by the observation that it is almost impossible to add




enough cold competitor sequence to completely compete away this complex
in electrophoretic mobility shift assays (EMSA). This latter observation
implies that it may not be possible to raise oligonucleotide concentrations
above the Kd, thus preventing the determination of exactly what fraction of
the final Myc preparations is active for sequence specific binding by the
Myc homo-oligomers.
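
For complexes whose dissociation is slow enough to follow, an off rate can be estimated from a competitor time course by assuming first-order decay, fraction(t) = exp(-k_off * t). The following is a minimal sketch under that assumption; the time points mirror those named in the figure legends, but the fraction values are synthetic, generated from a hypothetical k_off of 0.01 per second, and are not data from this specification.

```python
import math


def fit_off_rate(times_s, fractions):
    """Least-squares slope through the origin of ln(fraction) versus time.

    Returns k_off in 1/s, assuming fraction(t) = exp(-k_off * t).
    """
    numerator = sum(t * math.log(f) for t, f in zip(times_s, fractions))
    denominator = sum(t * t for t in times_s)
    return -numerator / denominator


def half_life(k_off):
    """Time for half of the complex to dissociate."""
    return math.log(2) / k_off


# Synthetic illustration only: fractions of complex remaining relative to
# the starting lane, generated from a hypothetical k_off of 0.01 per second.
times = [30.0, 60.0, 240.0]
remaining = [math.exp(-0.01 * t) for t in times]
k_est = fit_off_rate(times, remaining)
t_half = half_life(k_est)
```

A complex with an immeasurably fast off rate, as described for the Myc homo-oligomer, would show no measurable signal at the earliest time point, so no fit of this kind would be possible.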

A binding site selection procedure may be used to determine the
optimal binding site for Myc. Sites may be selected from a pool of random
oligomers, such as 20-mers, in order to decrease bias in determining an
optimal binding site. A 12 base consensus sequence of GACCACGTGCTC
[SEQ ID No. 1] may be used, with the central E box core of CACGTG
appearing to be most conserved. Halazonetis and Kandil (Halazonetis and
Kandil, Proc. Natl. Acad. Sci. USA 88:6162-6166 (1991)) assumed that the
flanking sequences might be symmetric, and reported an optimal sequence
of GACCACGTGGTC [SEQ ID No. 2]. This sequence is quite similar to
the consensus that is preferred here, differing in only the 10th position
(where predominantly a C was utilized in the invention, although G is
significantly represented; Fig. 7, Group J). According to the invention, it
is possible to select a 12 base consensus sequence from a pool of predicted
complexity of 4^20 (~10^12), thus indicating that Myc has a strong sequence
preference despite its apparent weak binding affinity.
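
The arithmetic and sequence comparison in the preceding paragraph can be checked directly. The following minimal sketch computes the complexity of a random 20-mer pool, locates the position at which SEQ ID No. 1 and SEQ ID No. 2 differ, and tests each for reverse-complement symmetry; the helper names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def mismatch_positions(a, b):
    """1-based positions at which two equal-length sequences differ."""
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


SEQ_ID_1 = "GACCACGTGCTC"  # consensus preferred in this specification
SEQ_ID_2 = "GACCACGTGGTC"  # Halazonetis and Kandil sequence

pool_complexity = 4 ** 20  # distinct 20-mers, roughly 1.1 x 10^12
differences = mismatch_positions(SEQ_ID_1, SEQ_ID_2)
seq2_is_symmetric = reverse_complement(SEQ_ID_2) == SEQ_ID_2
seq1_is_symmetric = reverse_complement(SEQ_ID_1) == SEQ_ID_1
```

The symmetric sequence is its own reverse complement, consistent with the assumption of symmetric flanks, whereas the consensus preferred here is not.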

The second Myc containing DNA-binding complex provided in the
preparations of the invention also binds to sequences with a core of
CACGTG, but binds significantly more tightly than Myc alone. This
complex (the C2 complex) requires a 26-29 kD factor in addition to Myc.
This additional factor copurified with Myc, presumably because of similar
chromatographic properties and not via association with Myc, since the
chromatography performed in 5 M urea would denature such association.
This additional factor resembles Max, a protein whose gene was recently
isolated from mammalian cells, in that it does not bind efficiently to DNA
by itself but can hetero-oligomerize with Myc to bind tightly to the
sequence CACGTG. However, the factor of the invention differs from
Max in its apparent size (Max is reported to migrate at 21 kD).
Additionally, the Myc/Max hetero-oligomer appears to migrate at least as
slowly as a Myc only complex in EMSAs, while the C2 complex of the
invention migrates more rapidly than Myc alone.

In addition to the 26-29 kD factor, a second copurifying factor of
40-50 kD has been identified. The sites selected by complexes containing
this factor (herein termed C2' complexes) contained a CAGCTG core (the
µE2 sequence motif) as well as flanking sequences which bear a striking
resemblance to a recently reported binding site for myogenin
homo-oligomers (Wright et al., Mol. Cell. Biol. 11:4104-4110 (1991)).
Myogenin is an HLH containing protein of predicted molecular weight 32.5
kD whose optimal binding site is AACAG(T/C)TGTT [SEQ ID No. 3].
None of the sites (0/36) selected by the C2 or C2' complexes of the
invention contained a CAGTTG motif, while roughly half of the myogenin
selected sites contained such core sequences.
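
The core motifs distinguished above can be identified in a candidate site by scanning both strands. The following minimal sketch uses only sequences stated in the text (SEQ ID No. 1 and the two expansions of the myogenin site AACAG(T/C)TGTT); the labels and function names are illustrative and are not an assay defined by the specification.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Core motifs named in the text, with illustrative labels.
CORES = {
    "CACGTG": "core bound by C1 and C2 complexes",
    "CAGCTG": "core selected by C2' complexes (muE2 motif)",
    "CAGTTG": "core found in about half of myogenin selected sites",
}


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def core_motifs(site):
    """Sorted list of core motifs present on either strand of a site."""
    top = site.upper()
    bottom = reverse_complement(top)
    return sorted(core for core in CORES if core in top or core in bottom)


myc_consensus = core_motifs("GACCACGTGCTC")  # SEQ ID No. 1
myogenin_c = core_motifs("AACAGCTGTT")       # SEQ ID No. 3 with C
myogenin_t = core_motifs("AACAGTTGTT")       # SEQ ID No. 3 with T
```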



d. Assay for a Compound that Inhibits Myc Action

For ease in describing these assays, C1 complex association
and/or DNA binding, C2 complex association and/or DNA binding, and
C2' complex association and/or DNA binding are all referred to as c-Myc
activity.

Assays for c-Myc activity may be performed in vitro or in vivo. In
vitro assays may be performed as described in the Examples, for example,
by evaluating the effects of the desired compound, or of various amounts of
such compound, on the results of the electrophoretic mobility shift assay
and site selection techniques that will reveal whether binding of the
oligomer of interest to a specific DNA sequence motif has occurred in the
presence of the compound.
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For the in vivo assay of a compound that inhibits the desired Myc 
activity, at least two genetic constructs are utilized. First, a
recombinant construct capable of expressing Myc is required; second, a
reporter gene is required whose expression is operably linked to the Myc
activity and especially to the binding of the desired oligomer to the specific
DNA sequence or motif.

If desired, a recombinant construct capable of expressing a C2
complex protein or C2' complex protein may also be used. Alternatively, a
host may be chosen that naturally expresses such protein.

Recombinant constructs that are capable of expressing Myc protein
may be constructed utilizing the guidelines as described below or purchased
commercially.

The desired DNA binding sequence may be operably linked to any
gene which confers a selectable marker in the host system. In a preferred
embodiment, a marker gene which allows phenotypic selection in yeast,
and especially in Saccharomyces cerevisiae, is used.

Yeast that have been co-transformed with both an expressible myc
gene and with the desired DNA binding sequence may be used to (1)
identify the presence or absence of endogenous host proteins that interact
with Myc in a C2 or C2' complex; (2) classify a protein as a C1 complex
protein or as a C2' complex protein; and (3) identify and classify
compounds as agents which disrupt such Myc activity. C2 complex proteins
have previously also been termed Myc "partner" proteins.

All three applications are based on the same principle: in the
presence of c-Myc biological activity, one of three things will happen: C1
complexes will form; C2 complexes will form; or C2' complexes will
form. The protein complexes so formed, and especially the oligomeric
complexes, will bind to a specific DNA motif, binding to such motif will
be operably linked to the marker gene, and expression of the marker gene
will be altered, preferably stimulated, in response to such DNA binding.




In the absence of such oligomerization, oligomer-DNA complex formation 
will not occur and expression of the marker protein will not be altered. 

In the assays of the invention, there may be some level of binding to
a desired DNA binding sequence even in the absence of c-Myc. However,
when c-Myc is present in the cell, the amount and strength of the specific
DNA binding is increased.

Hosts that have been co-transformed with both an expressible c-Myc
gene and with the desired DNA binding sequence may be used to assay for
the presence or absence of endogenous host proteins that interact with
c-Myc activity. If such analyses reveal that the host contains c-Myc binding
proteins, or c-Myc dependent oligomers which, in the presence of c-Myc,
specifically bind to a desired DNA sequence, such c-Myc partner protein or
dependent-oligomer protein may be isolated using techniques known in the
art such as gel mobility shift analysis, cDNA expression cloning vectors
such as, for example, λgt10 and λgt11, or other cloning systems
specifically designed for high-efficiency cloning and expression of
full-length cDNA in yeast such as, for example, pGl and pTRP56, all of which
are commercially available (Clontech, Palo Alto, California).

It is not necessary that the host be completely deficient in C2
complex proteins (c-Myc partner proteins) or C2' complex proteins to be
useful in the method of the invention. As described below, if c-Myc is
expressed at levels much greater than those found in the host, reporter gene
transcription from endogenous partner proteins may be negligible, or of
such low amount that it does not otherwise alter the utility of the methods
of the invention.

If the c-Myc gene is transcribed from a strong promoter,
and/or if the c-Myc expression cassette is supplied on a high copy number
vector, the levels of c-Myc will be high enough to overcome a low level
background and such c-Myc constructs may be used to analyze the ability
of cloned c-Myc partners to influence c-Myc DNA binding. One of
ordinary skill in the art can adapt the expression system to the level of
expression desired using methods known in the art.

The C2 complex protein (the partner protein), or the C2' complex
protein, if supplied as a recombinant construct to the host cell, should be
capable of being expressed at levels comparable to that of the c-Myc protein.
C2 complex proteins may be identified by utilizing a phage plaque assay,
as described in the commonly-owned, copending U.S. patent application
entitled "Protein Partner Screening Assays and Uses Thereof," Application
No. 510,254, filed April 19, 1990, and incorporated herein by reference.
Proteins identified by such screening assay can be subcloned into
eukaryotic expression vectors known in the art and commercially available
so as to provide a recombinant source of partner protein gene expression.

The genetic constructs of the invention may be placed on different
plasmids, or combined on one plasmid. A construct may also be inserted
into the genome of a host cell. Preferably, the construct coding for the
c-Myc protein and the construct coding for the C2 complex protein or the
C2' complex protein are provided to the host on two different plasmids.

It is important to establish that the effect of the compound is due to
an effect on c-Myc activity and not an effect on the activity of the reporter
product per se. Such effect can be established by comparing the results
found in hosts which lack either the c-Myc expression vector or the C2 or
C2' protein expression vector or both.
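
The control comparison described above can be expressed as a simple decision rule. The following is a minimal sketch, assuming reporter readouts in arbitrary units and a hypothetical cutoff of a 50% reduction; the function name, the threshold, and the example readouts are illustrative and are not specified in this document.

```python
def classify_compound(full_untreated, full_treated,
                      control_untreated, control_treated,
                      max_ratio=0.5):
    """Classify a compound from reporter readouts (arbitrary units).

    "full" is the host carrying the c-Myc and partner expression vectors;
    "control" is a host lacking one or both of them. max_ratio is a
    hypothetical cutoff: a treated/untreated ratio at or below it counts
    as a reduction.
    """
    full_ratio = full_treated / full_untreated
    control_ratio = control_treated / control_untreated
    if control_ratio <= max_ratio:
        # The reporter falls even without c-Myc activity, so the effect
        # cannot be attributed to c-Myc.
        return "effect on reporter or host"
    if full_ratio <= max_ratio:
        return "candidate inhibitor of c-Myc activity"
    return "no inhibition detected"


# Hypothetical readouts for three compounds.
specific = classify_compound(100.0, 20.0, 10.0, 9.0)
nonspecific = classify_compound(100.0, 20.0, 10.0, 2.0)
inactive = classify_compound(100.0, 95.0, 10.0, 10.0)
```

Where the background readout in the control host is very low, the control ratio becomes noisy, so replicate measurements would be needed before relying on a rule of this kind.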

The desired DNA binding motif may be located at any site in the
transcription cassette of the reporter gene which allows for the transcription
of that gene to be operably-linked to binding of the desired oligomer. Thus,
such motif may be located 5' to the transcriptional start site or 3' to the
transcriptional start site, for example, in an intron, similar to its location
relative to the promoter region in the immunoglobulin genes.

The reporter gene whose expression is operably linked to c-Myc
activity and especially to oligomer DNA binding may be any gene whose
expression can be monitored. Any detectable phenotype change may serve
as the basis for the methods of the invention. In a preferred embodiment,
the reporter gene is a gene not normally expressed by the host, or a gene
that replaces the host's endogenous gene. Any reporter gene which is
capable of being operably-linked to a promoter capable of responding to the
binding of the oligomer of interest to the specific target DNA sequence
may be used.

Especially, for example, genes that endow the host with an ability to
grow on a selective medium are useful. For example, in yeast, use of the
yeast LEU2 gene as a reporter gene in strains that normally lack LEU2
allows such yeast to grow on leucine as a sole carbon source. Expression
of the reporter gene is monitored by merely observing whether the host
possesses the ability to grow on leucine. In a similar manner, use of the
suc2 gene as a reporter gene would allow growth of a suc2- yeast host
on sucrose to be used as the detection method. In both examples, growth
on the indicated substrate could be used to indicate specific DNA binding
of the oligomer of interest and lack of such growth could be used to
indicate lack of binding or lack of oligomer formation.

In another example, a construct (and host) which is gall+gallO'
would respond to galactose in the medium; a construct (and host) which is
loci* gall* would be lactose sensitive. Other reporter genes include his3,
ura3 and trp5. One of ordinary skill in the art can imagine many other
appropriate reporter systems which would reveal the presence or inhibition
of DNA binding or biological activity of the oligomer of interest.

Reporter constructs in which the desired DNA sequence motif and
the lacZ reporter gene are operably linked will express β-galactosidase in
response to binding of a c-Myc activity induced oligomer to such
DNA sequence. Such expression can be easily scored by monitoring the
ability of the host to produce β-galactosidase (Maniatis, T. et al.,
Molecular Cloning (A Laboratory Manual), 2nd edition, Cold Spring
Harbor Laboratory, 1989). The production of β-galactosidase may be
visually monitored by detecting its activity to reduce the chromophoric dye,
X-gal (commercially available from International Biotechnologies, Inc.,
New Haven, CT). β-galactosidase reduces X-gal to a form which
possesses a blue color. In another embodiment, the coding sequence of
chloramphenicol acetyltransferase (CAT) is used as the reporter gene.
5 Any detection method that can identify expression of the reporter 

gene may be used. For example, levels of the product of the reporter gene 
may be directly assayed with an immunoassay. Such immunoassays 
include those wherein the antibody is in a liquid phase or bound to a solid 
phase carrier. In addition, the reporter gene can be detectably labeled in 

10 various ways for use in immunoassays. The preferred immunoassays for 
detecting a reporter protein using the include radioimmunoassays, enzyme- 
linked immunosorbent assays (EUSA), or otter assays known in the art, 
such as immunofluorescent assays, chemiluminescent assays, or 
bioluminescent assays. 

In an assay to screen for the ability of a compound to alter binding
of the oligomer of interest, yeast strains that express the desired
peptide or peptides and which contain the related DNA binding sequence
motif may be plated and grown as lawns, and the compound to be tested
may be applied to the plates on a filter paper disk that is impregnated with
such compound. Alternatively, the compound may be incorporated into the
media within which the host cells are growing.

One may be able to detect the ability of a compound to alter c-Myc
activity by the appearance of a zone, which often resembles a halo, around
the compound-impregnated disk. If, for example, the compound is toxic to
the host's survival per se, the host will not grow in the zone containing the
compound.

The methods of the invention can be used to screen compounds in
their pure form, at a variety of concentrations, and also in their impure
form. The methods of the invention can also be used to identify the
presence of such inhibitors in crude extracts, and to follow the purification
of the inhibitors therefrom. The methods of the invention are also useful in
the evaluation of the stability of the inhibitors identified as above, to
evaluate the efficacy of various preparations.

The permeability of cells to various compounds can be enhanced, if
necessary, by use of a mutant cell strain which possesses an enhanced
permeability or by using compounds which are known to increase
permeability. For example, in yeast, compounds such as polymyxin B
nonapeptide may be used to increase the yeast's permeability to small
organic compounds. In cells from the higher eukaryotes, dimethyl sulfoxide
(DMSO) may be used to increase permeability. Analogs of such
compounds which are more permeable across yeast membranes may also be
used. For example, dibutyryl derivatives often display an enhanced
permeability.

In a preferred embodiment, the genetic constructs and the methods
for using them are utilized in eukaryotic hosts, and especially in yeast,
insect and mammalian cells. The introduced sequence is incorporated into
a plasmid or vector capable of either autonomous replication or integrative
activity.

The sequence of c-Myc is known (Battey, J. et al., Cell 34:779-787
(1983)) and probes which are capable of identifying a c-Myc clone are
commercially available (New England Nuclear/DuPont Biotechnology,
Boston, MA).

The DNA sequence of the desired gene may be chemically
constructed if it is not desired to utilize a clone of the genome or mRNA as
the source of the genetic information. Methods of chemically synthesizing
DNA are well known in the art (Oligonucleotide Synthesis, A Practical
Approach, M.J. Gait, ed., IRL Press, Washington, D.C., 1984; Synthesis
and Applications of DNA and RNA, S.A. Narang, ed., Academic Press,
San Diego, CA, 1987). Because the genetic code is degenerate, more than
one codon may be used to construct the DNA sequence encoding a
particular amino acid (Watson, J.D., In: Molecular Biology of the Gene,
3rd edition, W.A. Benjamin, Inc., Menlo Park, CA, 1977, pp. 356-357).
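
The degeneracy noted above can be illustrated with a short sketch. It is offered only as an illustration and is not part of the described methods: it builds the standard genetic code table, lists the synonymous codons for an amino acid, and counts how many distinct DNA sequences could encode a short peptide. The function names and the example peptide "MLK" are hypothetical and chosen for illustration only.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (first base varies slowest, third base fastest). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(amino_acid):
    """Return every codon that encodes the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def count_encodings(peptide):
    """Count the distinct DNA sequences that encode a peptide (no stop codon)."""
    total = 1
    for amino_acid in peptide:
        total *= len(synonymous_codons(amino_acid))
    return total


if __name__ == "__main__":
    # Hypothetical three-residue peptide used only to show the arithmetic.
    for aa in "MLK":
        print(aa, synonymous_codons(aa))
    print("distinct coding sequences for MLK:", count_encodings("MLK"))
```

A chemically synthesized gene can therefore be built from any one of these equivalent sequences, which in practice leaves room to choose codons suited to the intended host.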
To express the recombinant constructs of the invention,
transcriptional and translational signals recognizable by the host are
necessary. A cloned protein-encoding DNA sequence, obtained through the
methods described above (preferably in a double-stranded form), may be
operably-linked to sequences controlling transcriptional expression in an
expression vector, and introduced, for example by transformation, into a
host cell to produce recombinant proteins useful in the methods of the
invention, or functional derivatives thereof. Such techniques are well
known in the art (Recombinant DNA Methodology, Wu, R. et al., eds.,
Academic Press, (1989); Maniatis, T. et al., Molecular Cloning (A
Laboratory Manual), second edition, Cold Spring Harbor Laboratory,
1989).

Transcriptional initiation regulatory signals can be selected which
allow for repression or activation of the expression of the c-Myc construct
or construct of the recombinant C2 complex peptide (or the C2' peptide),
or both, so that expression of such constructs can be modulated, if desired.
Of interest are regulatory signals which are temperature-sensitive so that by
varying the temperature, expression can be repressed or initiated, or are
subject to chemical regulation, for example, by a metabolite, salt, or
substrate added to the growth medium.

Where the native expression control signals do not
function satisfactorily in the host cell, then sequences functional in the host
cell may be substituted.

Expression of the constructs of the invention in different hosts may
result in different post-translational modifications which may alter the
properties of the proteins expressed by these constructs. It is necessary to
express the proteins in a host wherein the ability of the protein to retain its
biological function is not hindered. Expression of proteins in yeast hosts is
preferably achieved using yeast regulatory signals. The vectors of the
invention may contain operably-linked regulatory elements such as
upstream activator sequences in yeast, or DNA elements which confer
species, tissue or cell-type specific expression on an operably-linked gene.

In general, expression vectors containing transcriptional regulatory
sequences, such as promoter sequences and transcription termination
sequences, are used in connection with a host. These sequences facilitate
the efficient transcription of the gene fragment operably-linked to them.
In addition, expression vectors also typically contain discrete DNA
elements such as, for example, (a) an origin of replication which allows for
autonomous replication of the vector, or elements which promote insertion
of the vector into the host's chromosome in a stable manner, and (b)
specific genes which are capable of providing phenotypic selection in
transformed cells. Eukaryotic expression vectors may also contain elements
which allow them to be maintained in prokaryotic hosts; such vectors are
known as shuttle vectors.

The precise nature of the regulatory regions needed for gene
expression will vary between species or cell types and there are many
appropriate expression vector systems that are commercially available.

In a highly preferred embodiment, yeast are used as the host cells.
The elements necessary for transcriptional expression of a gene in yeast
have been recently reviewed (Struhl, K. Ann. Rev. Biochem. 58:1051-1077
(1989)). In yeast, most promoters contain three basic DNA elements: (1)
an upstream activator sequence (UAS); (2) a TATA element; and (3) an
initiation (I) element. Some promoters also contain operator elements.
Methods in yeast genetics are well known (Struhl, K. Nature 305:391-397
(1983); Sherman et al., Methods in Yeast Genetics, Cold Spring Harbor
Laboratory (1983)).

In another embodiment, mammalian cells are used as the host cells. 
A wide variety of transcriptional and translational regulatory signals can be 
derived for expression of proteins in mammalian cells and especially from 
the genomic sequences of viruses which infect eukaryotic cells. 
Once the vector or DNA sequence containing the construct(s) is
prepared for expression, the DNA construct(s) is introduced into an
appropriate host cell by any of a variety of suitable means, for example by
transformation. After the introduction of the vector, recipient cells are
grown in a selective medium, which selects for the growth of vector-
containing cells. Expression of the cloned gene sequence(s) results in the
production of the protein. This expression can take place in a continuous
manner in the transformed cells, or in a controlled manner.

Genetically stable transformants may be constructed with episomal
vector systems, or with integrated vector systems whereby the fusion
protein DNA is integrated into the host chromosome. Such integration may
occur de novo within the cell or be assisted by transformation with a vector
which functionally inserts itself into the host chromosome, for example,
with retroviral vectors, transposons or other DNA elements which promote
integration of DNA sequences in chromosomes.

Cells which have been transformed with the DNA vectors of the
invention are selected by also introducing one or more markers which allow
for selection of host cells which contain the vector. For example, the
marker may provide biocide resistance, e.g., resistance to antibiotics or
heavy metals, such as copper, or the like.

The transformed host cell can be fermented or cultured according to
means known in the art to achieve optimal cell growth, and also to achieve
optimal expression of the cloned protein sequence fragments. As described
hereinbelow, a high level of recombinant protein expression for the cloned
sequences coding for the proteins can be achieved according to a preferred
procedure of this invention.

The methods of the invention are not intended to be limited to
c-Myc and possess utility for the characterization of inhibitors against any
Myc protein, such as, for example, N-Myc and L-Myc. The C2 complex
peptides of the invention may interact with more than one Myc protein and
the C2' complex peptides of the invention may form as the result of the
activity of more than one Myc protein.

The following examples further describe the materials and methods
used in carrying out the invention. The examples are not intended to limit
the invention in any manner.

EXAMPLES

Example 1

Materials and Methods



Cell Growth and Myc Overexpression: The 5A cell line was maintained in
spinner culture under selection with 80 µM methotrexate. Protein
purification started with roughly 6 liters of cells at 5x10⁵/ml grown up
without selection. Heat shock promoter induction was achieved by
resuspension in preheated fresh media (43°C) at 1/3 the original volume.
Cells were incubated with stirring at 43°C for 1 h. To allow translation of
the accumulated mRNA, cells were transferred to 37°C culture conditions
for 3 h. Cells were then subjected to the purification described below.

The baculovirus overexpression vector was constructed by insertion
of the BamHI/BclI fragment of pGEMMycB [Halazonetis and Kandil,
Proc. Natl. Acad. Sci. USA 88:6162-6166 (1991)] into the BamHI site of a
baculovirus expression vector, pVL941, obtained from the laboratory of
Max Summers (Texas A&M University, College Station, Texas). The
resulting plasmid contained the entire coding sequence of the human Myc
gene including 6 nucleotides 5' of the initiation codon and 3' untranslated
sequence extending to the genomic RsaI site. Sf9 cells were grown and
infected with recombinant baculovirus according to the methods of
Summers [Summers and Smith, A Manual of Methods for Baculovirus
Vectors and Insect Cell Culture Procedures, Texas Agricultural Experiment
Station Bulletin No. 1555] with minor changes. Cells were passaged in
spinner culture and plated on 150 mm diameter tissue culture plates for
protein production. Cells were infected and harvested approximately 48 h
post infection by scraping. Cells were then washed in PBS and subjected
to the purification described below.

The Protein A-c-Myc fusion protein was expressed in the E. coli
AR68 strain from a previously published pRIT2T vector [Dang, C.V.,
Anal. Biochem. 174:313-317 (1988)] which fused the Ig binding portion of
protein A to either amino acids 353-439 or amino acids 372-439 of c-Myc.
Growth and induction of the cells was as per Dang et al. [Anal. Biochem.
174:313-317 (1988)].

Protein Purification: All purification steps were carried out on ice or with
ice cold buffers unless otherwise stated. Cells may be used fresh or stored
quick frozen in liquid nitrogen for larger batch preparations. 5A or Sf9
cells were washed in phosphate-buffered saline (PBS) and resuspended at
2.1x10⁷ cells/ml in Low Salt Lysis Buffer (20 mM HEPES pH 6.8, 5 mM
KCl, 5 mM MgCl₂, 0.5% NP40, 0.1% Na-deoxycholate, 1 µg/ml
aprotinin, and 0.1 mM PMSF) [Evan and Hancock, Cell 43:253-261
(1985)]. After 10 min cells were subjected to 40 strokes in a Dounce
homogenizer with a type A pestle. Nuclei were pelleted at 1000xg, 5 min,
4°C, washed once in 50 ml Low Salt Lysis Buffer, resuspended at 2.5x10⁸
nuclei/ml in Low Salt Lysis Buffer containing 50 µg/ml DNase I and
incubated at 4°C for 1 h. An equal volume of ice cold 2X High Salt
Buffer (2x concentrations: 20 mM Tris, pH 7.4, 4 M NaCl, 1 mM MgCl₂,
and 0.1% NP40) [Evan and Hancock, Cell 43:253-261 (1985)] was then
added, mixed gently, and incubated for 10 min. The residual nuclear
material (including the c-Myc protein) was pelleted (2000xg, 10 min, 4°C)
and resuspended for solubilization at 5.5x10⁷ nucleus equivalents/ml in
Buffer A (50 mM Tris, pH 8.0, 2 mM EDTA, 5% glycerol, 0.1 mM DTT,
and 0.1 mM PMSF) [Watt et al., Mol. Cell. Biol. 5:448-456 (1985)]
containing 5 M urea (referred to as 5 M urea Buffer A) achieved by
dilution of a freshly deionized stock of 6 M urea. This and all buffers used
on columns were passed through 0.2 µm pore filter units. Residual nuclei
were solubilized by vigorous stirring on ice for 30 min. This protein
solution was centrifuged (10 min, 5000xg, 4°C) to pellet any insoluble
material prior to chromatography. The supernatant was loaded on a 10 ml
DEAE Sepharose CL-6B (Pharmacia) column equilibrated with 5 column
volumes of 5 M urea Buffer A. Sample loading was at 0.1 ml/min and
column washing and elution were at 0.4 ml/min. After loading, the
column was washed with 3 volumes of 5 M urea Buffer A containing no
additional salt followed by 4 volumes of the same buffer containing 0.1 M
NaCl. Myc protein was eluted in the following elution step at 0.35 M
NaCl. The protein-containing fractions of this 0.35 M NaCl step were
pooled and diluted with fresh 5 M urea Buffer A to 0.1 M NaCl and loaded
onto a 1 ml FPLC Mono-Q column (Pharmacia) run at 0.5 ml/min. The
Mono-Q column was eluted with a programmed gradient of 5 ml spanning
0.10 M NaCl to 0.35 M NaCl followed by a 2 M NaCl step. For
enhanced purity the gradient was held manually at approximately 0.19 M
until the major contaminating protein finished eluting as determined by an
in-line UV monitor. In the initial development of the purification protocol
fractions from the columns were assayed for Myc by slot blotting followed
by visualization using the 1F7 monoclonal antibody and ¹²⁵I-labeled
secondary antibody. For later preparations silver staining of SDS-PAGE
allowed sufficiently unambiguous identification of the Myc proteins and
provided an assessment of the purity of given fractions. The Myc-
containing fractions were pooled based on purity and dialyzed against
buffer containing 20 mM Tris, pH 7.8, 50 mM KCl, 10% glycerol, 0.1
mM DTT, and 0.1 mM PMSF (referred to as Dialysis Buffer) in bags of
SpectroPor 2 membrane for 3 changes, 2 liters each, for a minimum of 3 h
each. Pools of fractions prepared this way contained C1 and C2 (and C2')
binding activities. To obtain pure C1 binding activity the Myc-containing
Mono Q fractions were assayed by EMSA and those free of C2 binding
activity were pooled and dialyzed separately.
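
Several steps in the purification above rest on simple dilution arithmetic (mixing with an equal volume of 2X buffer, diluting the 0.35 M NaCl pool to 0.1 M before the Mono-Q column, and preparing 5 M urea from a 6 M stock). The following sketch shows that arithmetic only; it is not part of the described protocol. The function names and the 10 ml pool volume are hypothetical, and the calculation assumes volumes are additive and that the diluent contains no NaCl.

```python
def mixed_concentration(vol_a, conc_a, vol_b, conc_b):
    """Concentration after combining two solutions (volumes assumed additive)."""
    return (vol_a * conc_a + vol_b * conc_b) / (vol_a + vol_b)


def diluent_volume(sample_vol, conc_start, conc_target, conc_diluent=0.0):
    """Volume of diluent needed to bring a sample from conc_start to conc_target."""
    if not conc_diluent < conc_target <= conc_start:
        raise ValueError("target must lie between diluent and starting concentration")
    return sample_vol * (conc_start - conc_target) / (conc_target - conc_diluent)


def stock_fraction(conc_stock, conc_final):
    """Fraction of the final volume that must come from the stock solution."""
    if conc_final > conc_stock:
        raise ValueError("cannot exceed the stock concentration by dilution")
    return conc_final / conc_stock


if __name__ == "__main__":
    # Equal volumes of NaCl-free lysate and 2X High Salt Buffer (4 M NaCl).
    print("NaCl after 2X buffer addition (M):", mixed_concentration(1.0, 0.0, 1.0, 4.0))

    # Hypothetical 10 ml pool at 0.35 M NaCl, diluted to 0.1 M with salt-free buffer.
    print("buffer to add to a 10 ml pool (ml):", round(diluent_volume(10.0, 0.35, 0.10), 2))

    # Share of a 5 M urea buffer that comes from the 6 M urea stock.
    print("fraction from 6 M urea stock:", round(stock_fraction(6.0, 5.0), 3))
```

The second calculation shows why the DEAE pool grows several-fold in volume before it is loaded on the 1 ml Mono-Q column.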

The bacterially produced Protein A-c-Myc fusion protein was
partially purified by differential centrifugation and solubilized in 5 M urea
according to Watt et al. [Bagchi et al., Mol. Cell. Biol. 7:4151-4158
(1987)] with the following minor modifications: Protease inhibitors were
present in the initial lysis buffer (10 µg/ml pepstatin, 1 mM PMSF, 50
µg/ml aprotinin, 2 µg/ml leupeptin, 10 mM Na-metabisulfite, and 1 mM
benzamidine) and cells were sheared by 6 bursts of 15 s each in a Cuisinart
MiniMate on ice. The urea solubilized material was cleared of insoluble
material by centrifugation (10,000xg, 10 min, 4°C) and dialyzed into
Dialysis Buffer containing 0.5 mM DTT. Precipitated material was
removed by centrifugation (15,000xg, 20 min, 4°C). Protein A-Myc
fusion protein was purified from the supernatant by IgG affinity essentially
according to Nilsson et al. [EMBO J. 4:1075-1080 (1985)]. A 1 ml aliquot
of supernatant was incubated with 0.1 ml of a 50% slurry of IgG Sepharose
6 fast flow (Pharmacia), rocking for 1 h at 4°C. The pellet was washed
twice with Buffer A and the fusion protein eluted with 0.3 M lithium
diiodosalicylate (LIS). The eluate was then dialyzed extensively to remove
the LIS (initially against Buffer A at room temperature to avoid LIS
precipitation, then against Dialysis Buffer at 4°C). The two bacterially
expressed Myc preparations were compared by Coomassie staining of SDS-
PAGE to ensure that equal amounts of the fusion proteins were used for
experiments.

N-Terminal Sequencing: The 3 bands of purified Myc from 5A cells were
individually isolated by electroelution according to Hunkapiller et al.
[Meth. Enz. 91:227-236 (1983)]. Preparative SDS-PAGE was carried out
and protein bands excised after visualization with Coomassie Brilliant Blue
R-250. After electroelution the material was precipitated 2 times with
methanol/acetone and submitted for N-terminal sequencing by Edman
degradation.

Antibodies: The monoclonal antibody, 1F7 (a generous gift of R.
Chizzonite, Hoffmann-La Roche), is directed against the peptide sequence
comprising amino acids 305-317 in murine c-Myc [Miyamoto et al., Proc.
Natl. Acad. Sci. USA 82:7232-7236 (1985)]. The antibody directed against
cI was monoclonal 51F [Breyer and Sauer, J. Biol. Chem. 264:13348-
13354 (1989)] which had been purified by ammonium sulfate precipitation
and chromatography on QAE Sephadex.

Electrophoretic Mobility Shift Assay (EMSA): Radiolabeled probes were
produced via a Klenow fill-in of annealed oligonucleotides containing 4
base 5' overhangs at each end (see table below for sequences). Binding
reactions took place in a final volume of 20 µl containing 2 ng of labeled
probe, 125 ng poly d(IC), an indicated amount of protein, and the
following final buffer conditions: 10 mM Tris, pH 7.5, 50 mM KCl, 0.1
mM EDTA, 1 mM DTT, 1 mM MgCl₂ and 5% glycerol. Binding
reactions were allowed to proceed for 20 min at room temperature and
were then loaded immediately on a 4% polyacrylamide gel which had been
prerun at least 1 h at 10 V/cm. Electrophoresis was for 1.5 h at 10 V/cm in
0.5x TBE.

Cut and Renature: The method of Bagchi et al. [Bagchi et al., Mol. Cell.
Biol. 7:4151-4158 (1987)] was followed except for the final dialysis step.
Precipitated protein samples containing BSA as carrier protein were
solubilized in 6 M guanidine-hydrochloride (200 µl unless otherwise
indicated) according to Bagchi et al. [Mol. Cell. Biol. 7:4151-4158 (1987)].
Directly prior to analysis by EMSA the samples were subjected to dialysis
alone or in combination with another sample in a total volume of 15 µl.
Equal volumes of each sample were used in a given experiment and the
volume was brought to 15 µl using 6 M GuHCl containing 0.1 mg/ml
BSA. Dialysis was against 40 ml of Dialysis Buffer carried out for 1 h at
4°C on floating 13 mm membrane discs (Millipore #VSWP-013, pore size
0.025 µm).

Site Selection from Random Sequences: The following procedure was
devised based on the method of Pollock and Treisman [Nucl. Acids Res.
18:6197-6204 (1990)]. A 52 base oligonucleotide "randomer" (see table
below) was annealed to the following 16 base Xho I primer: 5'
CCX3ATATCTCGAGACGG 3' [SEQ ID No. 4]. The annealed primer
was extended using Klenow and nucleotides (0.2 mM cold dNTPs and 0.4
µM α-³²P-dCTP, 800 Ci/mmol) to create a pool of double stranded probes
representing approximately 4²⁰ sequences. The initial round of binding site
selection by EMSA utilized 200 ng of this pool and either 0.37 µg of
baculovirus produced c-Myc or 0.5 µg of CHO produced c-Myc. Other
parameters were as previously described for EMSA. Lanes containing
randomer probes were alternated with reference lanes containing 2 ng
(USE)3 probe and 0.37 µg of baculovirus c-Myc. The completed EMSA
gel was electroblotted onto NA45 membrane (200 mA, 2.5 hrs) and the wet
membrane was wrapped in plastic wrap and exposed for at least 1.5 hrs.
The regions of the randomer lanes corresponding to the visible C1 and C2
complexes of the reference lanes were excised and eluted with 100 µl of
elution solution (10 mM Tris, pH 8.0, 1 mM EDTA, 1 M NaCl) for 30 min at
68°C. The liquid was transferred to a fresh tube and the membrane was
rinsed with 100 µl TE which was added to this eluate. After pelleting the
particulate debris, the DNA was precipitated with the addition of 10 µg
glycogen, 2 µl 1 M MgCl₂ and 2.5 volumes of ethanol. The pellet was
rinsed with 70% ethanol, dried, and the recovery assessed by scintillation
counter. The entire pellet of each sample (~29-57 pg) was resuspended in
10 µl 10x PCR buffer (500 mM KCl, 100 mM Tris, pH 8.4, 1 mg/ml
gelatin, 15 mM MgCl₂) and 32 µl water. After addition of 1 µl each of
100 µM Xho I primer and Xba I primer (5' GGACGATCTAGATTCG 3',
[SEQ ID No. 5]), 5 µl of nucleotide mix (2 mM dNTPs and 4 µM α-³²P-
dCTP, 800 Ci/mmol), and 1 U Taq polymerase the reactions were overlaid
with paraffin oil and subjected to 20 cycles of PCR in an Ericomp
machine: 2 min 94°C, 20x (15 sec 95°C, 15 sec 55°C), 10 min 72°C.
The products were gel purified on 10% acrylamide and precipitated using
10 µg glycogen as carrier. Recovery was measured by scintillation counter
and after resuspension in the EMSA reaction buffer (10 mM Tris, pH 7.5,
50 mM KCl, 1 mM EDTA, 1 mM MgCl₂, and 5% glycerol) this probe
was used for the next round of EMSA selection. Subsequent cycles were
primarily as above; however, 50 ng of probe was used. Eight rounds of
selection and amplification were completed for the baculovirus c-Myc and
seven rounds for the CHO c-Myc. After the final PCR reaction the
products were extracted twice with phenol, twice with ether, and
precipitated prior to digestion with Xho I and Xba I. After gel isolation the
appropriate fragment was subcloned into the Bluescript SK vector
(Stratagene) and sequenced by standard procedures.
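
As a rough check on how well the first-round probe pool samples the random region, the short sketch below compares the number of possible 20-base random cores with the approximate number of double stranded molecules in 200 ng of a 52 base pair probe. It is an illustrative estimate only and not part of the described procedure: the average mass of about 650 g/mol per base pair is a common approximation introduced here as an assumption, and the function names are hypothetical.

```python
AVOGADRO = 6.02214076e23       # molecules per mole
AVG_BP_MASS = 650.0            # g/mol per base pair; approximation (assumption)


def library_complexity(random_positions):
    """Number of distinct sequences in a fully random stretch of DNA."""
    return 4 ** random_positions


def duplex_molecules(mass_ng, length_bp):
    """Approximate number of double stranded molecules in a given mass."""
    moles = (mass_ng * 1e-9) / (length_bp * AVG_BP_MASS)
    return moles * AVOGADRO


def mean_copies_per_sequence(mass_ng, length_bp, random_positions):
    """Average number of molecules per distinct sequence in the pool."""
    return duplex_molecules(mass_ng, length_bp) / library_complexity(random_positions)


if __name__ == "__main__":
    print("possible 20-mers:", library_complexity(20))
    print("molecules in 200 ng of 52 bp probe: %.2e" % duplex_molecules(200, 52))
    print("mean copies per sequence: %.1f" % mean_copies_per_sequence(200, 52, 20))
```

Under these assumptions the 200 ng used in the first round contains on the order of a few copies of each possible sequence, so most candidate sites are expected to be present at least once at the start of selection.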

Oligonucleotides Used: Oligonucleotide sequences that were used are
shown below, with the E-Box core sequences underlined:

SEQ ID NO. 6:

(µE2)3 5' CXTCTCTGCA GCAGCTGG CA GCAGCTGC CA GCAGCTGC CG 3';

SEQ ID NO. 7:

(µE3)3 5' GATCTGCAGTCATGTGGCGTCATGTGGCGTCATGTGGCAG 3';

SEQ ID NO. 8:

(USE)3 5' GATCTGCAGTCACGTGGCGTCACGTGGCGTCACGTGGCAG 3';

SEQ ID NO. 9:

MLC-A 5' TCGACGTCGCAGCAGGTGCAG 3';

SEQ ID NO. 10:

MLC-B 5' TCGACCCCACfiaSdSeGCGAG 3';

SEQ ID NO. 11:

ERP1/2 5' AGCTTCGAA£&2JSCGCAGSftGJ2ISGCAGGAAGCAGGCCTA 3';

SEQ ID NO. 12:

ERP3/4 5' AGCTTTAAAATCCCCACfiAG^TfiGCGAAGCAAfiaSSISCA 3';

SEQ ID NO. 13:

HSE 5' AATTGCGAAACCCCTGGAATATTCCGACCTGGCAGCCTC 3';

SEQ ID NO. 14:

SMS 5' tw: & f-TTTAGA CCACGTG GTCCCCTCGA 3';

SEQ ID NO. 15:

Randomer 5' GGACGATCTAGATTCG(N)20CCGTCTCGAGTATCGG 3'



Example 2

Purification of c-Myc Protein

A primary goal of this work was to purify and characterize Myc
from a mammalian source. An inducible mammalian overexpression
system that has been described previously was utilized (Wurm et al., Proc.
Natl. Acad. Sci. USA 83:5414-5418 (1986)). Briefly, the two coding exons
of the mouse c-Myc gene under the control of a Drosophila heat shock
promoter had been integrated and amplified in the genome of a Chinese
hamster ovary (CHO) cell line. This overexpressing cell line, 5A, was
adapted to spinner culture. Heat shock (43°C) induces transcription of the
amplified myc genes while a subsequent 2 hour recovery period at normal
temperature (37°C) permits translation [...]
phosphoproteins of 60, 62, and 72 kD which were immunoprecipitable with
Myc-specific monoclonal antibodies (Wurm et al., Proc. Natl. Acad. Sci.
USA 83:5414-5418 (1986)). The c-Myc produced was tightly associated
with the nuclei and attempts to solubilize it using a number of detergents,
salts, and reducing agents were unsuccessful (data not shown). Significant
solubilization was achieved, however, with either SDS or with urea at
concentrations greater than 4 M. For purification, the Myc was solubilized
with 5 M urea and chromatographed on DEAE resin and FPLC Mono-Q as
described in Materials and Methods. The presence of Myc in the column
fractions was assayed by immunoblot using an antipeptide monoclonal
antibody, 1F7 (Miyamoto et al., Proc. Natl. Acad. Sci. USA 82:7232-7236
(1985)). This purification procedure yielded 150 µg of c-Myc per liter of
spinner cells (8x10⁸ cells). The Myc appeared to be 95% homogeneous as
judged by silver staining (Fig. 1A).
An alternative translation start site for c-Myc accounts for some of
the molecular weight heterogeneity of c-Myc translated in vitro and
expressed in several cell lines (Hann et al., Cell 52:185-195 (1988)). This
alternate site is upstream from the canonical start site, however, and is not
present in our overexpressor gene. N-terminal sequence analysis of each of
the three prominent Myc bands described above revealed, as expected, the
sequence predicted by the canonical start site (data not shown), although
the N-terminal methionine was not present, presumably because of
N-terminal processing. Therefore the potentially important differences in
apparent molecular weight that are observed might be attributed to post-
translational modifications and not N-terminal heterogeneity.

Human c-Myc has also been purified using the baculovirus
overexpression system. For purification, Sf9 cells that had been infected
with recombinant virus were harvested just prior to the onset of lysis (~48
hours post infection). Myc produced using the baculovirus system has been
previously reported to be both phosphorylated and tightly associated with
the nucleus (Miyamoto et al., Mol. Cell. Biol. 5:2860-2865 (1985)).
Solubilization and purification were carried out as with the CHO produced
Myc, resulting in a yield of 2.5 mg/8x10⁸ cells. Myc purified from these
insect cells was apparently homogeneous by silver staining, and ran as a
single diffuse band of ~60 kD (Fig. 1B). This was in contrast to the
multiple bands observed with mammalian Myc by immunoblot (Fig. 1B).

Discussion

Myc was purified to near homogeneity from overexpressing
mammalian cells and baculovirus infected cells. The mammalian derived
protein appears to be highly modified in contrast to Myc expressed in and
purified from insect cells. Up to 19 distinct species of c-Myc can be
identified by two dimensional gel electrophoresis (Fig. 1). These species
differ both in size (approximate Mr of 60,000, 62,000 and 72,000,
although this estimate of size can vary with different gel conditions) and in
pI. These differences in pI might in part be attributed to differences in
phosphorylation, as c-Myc is known to be phosphorylated and the change
in pI of the species is consistent with incremental additions of phosphate.
Although the Myc produced by the baculovirus overexpression system does
not demonstrate the same molecular weight heterogeneity as the mammalian
protein, it too is phosphorylated (Miyamoto et al., Mol. Cell. Biol. 5:2860-
2865 (1985)). The specific sites of phosphorylation have not been
determined for either Myc preparation and other as yet unidentified
modifications may distinguish these two Myc preparations.

Example 3

Specific DNA Binding Activity Present in Purified c-Myc

The presence of a B-HLH domain in c-Myc suggested that it would
bind to an E-Box-like sequence of the general pattern CANNTG. These
sites were first identified in immunoglobulin enhancers but have since been
found in many other tissue specific enhancers. It was first determined if
any of these would be bound by the purified c-Myc proteins described in
Example 2. A large number of E box related sequences were screened by
electrophoretic mobility shift assays (EMSA). Those shown in Fig. 2
include synthetic oligonucleotides containing trimers of either the µE2
(CAGCTG) or µE3 (CATGTG) sites of the immunoglobulin enhancer and a
trimer of the Adenovirus major late promoter upstream element (USE)
(CACGTG). Two sites from the myosin light chain (MLC) enhancer are
also shown: the A site (CAGGTG) which resembles the κE2
immunoglobulin enhancer site, and the B site (CAGCTG) which has the
same core sequence as the µE2 site. The heat shock element (HSE) served
as a control since its sequence does not resemble an E-Box core.
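
The general E-Box pattern CANNTG and the core sequences named above can be checked with a brief sketch. This is an illustration only and not part of the described assays: it scans a DNA string for every occurrence of CANNTG, and the function name is hypothetical. The HSE sequence is taken from the oligonucleotide table (SEQ ID NO. 13); the short USE-containing fragment is an illustrative string built around the stated CACGTG core.

```python
import re

# Lookahead so that overlapping matches, if any, are all reported.
EBOX_PATTERN = re.compile(r"(?=(CA[ACGT]{2}TG))")


def find_eboxes(sequence):
    """Return (position, hexamer) for every CANNTG match in the sequence."""
    return [(m.start(), m.group(1)) for m in EBOX_PATTERN.finditer(sequence.upper())]


# Core hexamers named in the text.
CORES = {
    "uE2": "CAGCTG",
    "uE3": "CATGTG",
    "USE": "CACGTG",
    "MLC-A": "CAGGTG",
    "MLC-B": "CAGCTG",
}

# Heat shock element control oligonucleotide (SEQ ID NO. 13).
HSE = "AATTGCGAAACCCCTGGAATATTCCGACCTGGCAGCCTC"

if __name__ == "__main__":
    for name, core in CORES.items():
        print(name, core, "matches CANNTG:", bool(find_eboxes(core)))
    print("E-Boxes in HSE control:", find_eboxes(HSE))
    # Illustrative fragment containing one USE core.
    print("E-Boxes in fragment:", find_eboxes("GCGTCACGTGGCAG"))
```

All of the named cores fit the CANNTG pattern while the HSE control contains no match, which is consistent with its use as a non-E-Box control.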

Three specific binding activities were detected in this assay forming
complexes referred to as C1 (USE specific), C2 (USE specific), and C2'
(µE2 specific). As demonstrated below, despite the comigration of C2 and
C2', these represent separate complexes based on observed differences in
protein composition as well as binding specificity. The data presented
argue that the C1 complexes are formed by homo-oligomers of Myc while
formation of the C2 and C2' complexes each requires an additional protein.
The slowly migrating complex (C1) formed most readily on the USE (Fig.
2, lanes 5, 11, and 12), less well on the similar µE3 site (Fig. 2, lanes 2
and 8), and not at all on the other E-Box and non-E-Box sites tested. CHO
and baculovirus Myc preparations were similar with regard to the C1
complex; however, they differed with regard to the faster migrating
complexes. In the mammalian Myc assays the C2' complex formed on the
µE2 site of the immunoglobulin enhancer and the µE2-like sequence of
the MLC-B site (Fig. 2, lanes 1 and 4). Baculovirus Myc contained no
binding activity with this specificity (Fig. 2, lanes 7 and 10). In contrast,
formation of the C2 complex was detected using either Myc preparation.
The C2 complex formed most readily on the USE site (Fig. 2, lanes 5, 11,
and 12) and less well on the similar µE3 sequence (Fig. 2, lanes 2 and 8).
Very little if any binding was detected on the κE2-like sequences (MLC-A,
Fig. 2, lanes 3 and 9, and µE5, data not shown). No specific binding was
found on non-E-Box sequences such as the HSE (Fig. 2, lanes 6 and 13).

Competition experiments were performed on the three binding
activities C1, C2, and C2' to further characterize their specificity (data not
shown). In experiments using µE2, µE3, USE, µE5, or HSE sequences as
competitors, competition of the C2' complex formed on the µE2 probes
was most easily achieved with the µE2 oligos while the C2 complexes were
preferentially competed by the USE sequence. The C1 complex was also
competed most efficiently by the USE sequence. A detailed analysis of the
binding specificities of these complexes is presented below.

Example 4 

Proteins Responsible for Formation of C1, C2, and C2' Complexes

One scenario suggested by the differences in binding is that Myc
might not be the only protein involved in formation of the three complexes.
To distinguish the role of c-Myc from that of other copurifying proteins in
the formation of the observed complexes, cut and renature experiments were
performed as follows. Preparative amounts of Myc were separated by
SDS-PAGE. Proteins were electroeluted from various molecular weight
slices, precipitated, solubilized in guanidine hydrochloride, and dialyzed to
renature for analysis by EMSA. The C1 complex binding activity may be
renatured from the Myc-containing slices of either baculovirus or
mammalian preparations (Fig. 3), while no other slices from the entire gel
contained C1 activity (data not shown). These data argue that Myc alone is
the protein responsible for the C1 complex, and that full-length Myc
protein as expressed in eukaryotic cells can bind specifically to sites with
the core sequence CACGTG.

Analysis of the proteins responsible for formation of the C2 and
C2' complexes was achieved with additional cut and renature experiments
performed as described above. EMSA using the USE probe revealed no
single slice from CHO or baculovirus preparations which contained
detectable C2 binding activity (data not shown). However, this activity
was recovered by renaturing proteins from a 26-29 kD slice together with
proteins in the 60-70 kD Myc-containing slice (Fig. 4, lanes 1-8). The
26-29 kD component was present in gels loaded with either CHO- or
baculovirus-produced c-Myc and, when renatured with Myc, demonstrated
the same specificity as the C2 complex in the loaded material. Renaturation
of the 26-29 kD slice with BSA or protein A did not yield USE binding
activity, suggesting that Myc plays a specific role in the recovery of C2
binding activity.

To examine further the roles of copurifying proteins and of Myc
modifications in the observed binding, Myc was also purified from a
bacterial overexpression system. The expression system and purification
method used were those of Chi Dang and colleagues (see materials and
methods). The bacterially produced protein contains the IgG binding
segment of protein A fused to the C-terminal 85 amino acids of Myc, the
segment of Myc which contains the B-HLH and leucine zipper motifs. For
many of the B-HLH proteins, the small region of the protein containing the
B-HLH motif is not only necessary but fully sufficient for DNA binding if
the correct oligomerization partner is present. This protein was able to
form the C1 complex on the USE probes (Fig. 4, lane 9) and to combine
with the 26-29 kD factor to create the C2 complex (Fig. 4, lane 10).
Competition experiments confirmed the specificity of this reconstituted C2
complex. The C1 and C2 complexes formed using this bacterial fusion
protein migrated more rapidly than those formed using full-length c-Myc
(compare Fig. 4, lane 8 with lanes 9 and 10). This may be due to the
difference in size between the full-length c-Myc (60-72 kD) and the protein
A-Myc fusion protein (~38 kD), and therefore the mobility of the C2
complex may be interpreted as an indication that Myc is physically present
in the C2 complex, presumably as part of a hetero-oligomer with the 26-29
kD factor.

Analogous experiments were carried out using a similar bacterial
fusion protein containing only the C-terminal 67 amino acids of c-Myc.
This protein contains most of the HLH domain and the entire leucine zipper
domain but no basic region. Although this protein is capable of forming
homo-oligomers in solution (Gentz et al., Science 243:1695-1699 (1989)),
it was unable to bind to DNA to form the C1 complex and was also unable
to combine with the 26-29 kD factor to create any USE binding activity
(Fig. 4, lane 12). These data argue that the role of Myc in the C2
hetero-oligomer requires an intact basic region, the region responsible for
specific DNA contacts in other B-HLH proteins.

Using cut and renature experiments, the μE2 binding activity
responsible for the C2' complex was identified. A small amount
of the C2' complex was frequently seen with proteins from the slice
encompassing the 40-50 kD molecular weight range of mammalian Myc
preparations (Fig. 5A). Although no C2' complex was ever seen with the
Myc-containing slice alone, renaturation of the protein from the Myc slice
with the 40-50 kD slice reproducibly increased the amount of C2' complex
formed. Both the baculovirus-produced Myc and the bacterially expressed
fusion protein containing the basic region, which do not form complexes
themselves on μE2 probes, were also able to increase the amount of
complex formed by the 40-50 kD slice obtained from mammalian
preparations (Fig. 5B and C). Surprisingly, the bacterially produced Myc
lacking the basic region could also reconstitute C2' activity, while various
other proteins tried, including BSA, immunoglobulins, and protein A, could
not. The apparent lack of a role for this basic region suggests that Myc's
involvement in formation of this complex may be other than contacting DNA.

To further determine whether Myc was present in the analyzed
complexes, the Myc preparations were incubated with a Myc-specific
monoclonal antibody prior to EMSA. The probe used in this experiment
(SMS) contained a single site with the USE core sequence, CACGTG. The
Myc-specific antibody eliminated both the C1 and C2 complexes and
produced a prominent complex of slower mobility (Fig. 6). It is not clear
from these data which of the two complexes was supershifted, but the
presence of one predominant shifted complex when antibody is present and
two complexes in the absence of antibody argues that the Myc-specific
antibody also completely disrupted one of the original complexes. There
was no effect of a control monoclonal antibody on the formation of either
the C1 or C2 complex. The Myc-specific antibody did not alter the C2'
complex, suggesting that Myc is not present in this complex.

From these experiments it can be concluded that the C1 complex is
formed by Myc alone, that the C2 complex contains Myc and a 26-29 kD
factor, and that the C2' complex contains a 40-50 kD factor but does not
contain Myc. It is intriguing that the C2' complex requires the presence of
Myc for formation, but apparently does not contain Myc. Myc therefore
appears capable of affecting the 40-50 kD factor's ability to form the C2'
complex without being a member of the complex. Whatever the
mechanism, the increase in μE2 binding activity of the 40-50 kD factor
appears to be Myc-specific, since four different Myc proteins increased the
amount of C2' complex observed while several other proteins did not.

Max protein can be immunoprecipitated from avian and human cells,
and low stringency Southern analysis has suggested that a single Max gene
or a small family of genes exists in other vertebrates as well (Blackwood
and Eisenman, Science 251:1211-1217 (1991)). It is possible that hamster
and insect cells have an equivalent of Max. The recovery of a Max-like
activity from insect cells is particularly interesting since no Myc homologs
have been found in insects to date. Drosophila clearly uses B-HLH
heterodimers to regulate aspects of development, and the possibility remains
that the natural partner for the 26-29 kD protein in insect cells is an as yet
unidentified B-HLH protein which functions like Myc.

The presence of the 26-29 kD factor in these preparations might
limit their usefulness for certain experiments. By pooling Myc-containing
fractions based on an EMSA assay, one may obtain fractions that contain
only the C1 activity and that do not contain the C2 activity, although this
modification reduces the final yield by approximately 80%.

Example 5

Selection of Binding Sites For Myc From Random DNA Sequences

In order to determine the optimal binding sites for the three
complexes in the Myc preparations described above, a modification of a
recently described technique for isolating preferred binding sites from large
pools of randomized DNA sequences was used (Pollock and Treisman,
Nucl. Acids Res. 18:6197-6204 (1990)). Briefly, a pool of double-stranded
oligonucleotides was created that consisted of 16 base flanking regions of
defined sequence surrounding a 20 base region of completely random
sequence. Each of the eukaryotic Myc preparations described above was
mixed with this pool of sequences, and the protein-DNA complexes that
formed were separated by EMSA. The DNA that ran at the position of the
C1 or C2 (and comigrating C2') complexes was isolated, amplified by the
polymerase chain reaction (PCR), and used in a second round of EMSA
selection. Either seven (CHO preparation) or eight (baculovirus
preparation) rounds of selection in total were performed before subcloning
individual members of the selected sequences. As each round was expected
to enrich for better binding sites, the final subcloned oligonucleotides were
expected to contain high affinity binding sites for the C1, C2, and C2'
complexes. In addition, such a procedure should give some indication of
the relative stringency of selection for a given base at a particular position
within the binding site consensus.

The selected sequences were placed in three separate groups for
analysis (Fig. 7). Group I contains sequences that were selected by the C1
complex from either mammalian or baculovirus preparations. These
sequences were pooled for analysis because with both preparations
formation of the C1 complex requires only Myc protein, and because the
two sets of sequences (that isolated with mammalian Myc and that isolated
with baculovirus Myc) were similar to each other. Most of the selected
sequences in this group contained the sequence CACGTG (21/27 of
sequenced subclones). By aligning all of the sequences that contained this
central core sequence, it was found that the sequences flanking this core
were also nonrandom. A 12 base consensus sequence of
GACCACGTGCTC [SEQ ID NO:1] was determined for sites selected by
the C1 complex (see table in Fig. 7 for frequencies at each position; for a
base to be included in the consensus it had to be found in at least 10 out of
the 21 sequences with a CACGTG core). The C2 complex from
baculovirus preparations selected sequences similar to those selected by the
C1 complex (Fig. 7, Group II). Most of these selected sequences also
contained the CACGTG core (19/22). These sequences had flanking
sequences adjacent to the core hexamer similar to those found with the C1
complex, although there was a slight preference for GCC over CTC in the
3' flank (see table for Group II in Fig. 7).
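
The consensus rule described above (align selected sequences on the
CACGTG core, count bases at each flanking position, and call a base only
when it reaches a minimum count) can be sketched as follows. This is a
minimal illustration, not the procedure actually used: the four input
sequences are invented for the example and are not the subclones of
Fig. 7, and the function names and the threshold of 3 out of 4 are
assumptions standing in for the 10 out of 21 rule.

```python
from collections import Counter

CORE = "CACGTG"
FLANK = 3  # bases kept on each side of the core, giving a 12-base window

# Hypothetical selected sequences, invented for illustration only.
EXAMPLE_SEQS = [
    "TTGACCACGTGCTCAA",
    "AAGACCACGTGCTCGG",
    "CCTACCACGTGGTCTT",
    "GGGACCACGTGCCCAT",
]


def core_windows(seqs, core=CORE, flank=FLANK):
    """Cut a flank+core+flank window from each sequence that contains one."""
    windows = []
    for s in seqs:
        s = s.upper()
        i = s.find(core)  # first occurrence of the core, -1 if absent
        if i >= flank and i + len(core) + flank <= len(s):
            windows.append(s[i - flank:i + len(core) + flank])
    return windows


def consensus(windows, min_count):
    """Call the most common base per column if it occurs >= min_count times."""
    called = []
    for column in zip(*windows):
        base, n = Counter(column).most_common(1)[0]
        called.append(base if n >= min_count else "N")
    return "".join(called)


if __name__ == "__main__":
    aligned = core_windows(EXAMPLE_SEQS)
    print(consensus(aligned, min_count=3))
```

Raising min_count makes the call more stringent, so weakly preferred
flanking positions drop out as N while the core remains.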

As expected, complexes running at the position of C2 that were
selected by the mammalian Myc preparations had a greater diversity of
sequences (Fig. 7, Group III). Several sequences (8/36) contained the
CACGTG core. These sequences were presumably selected by the
mammalian C2 complex (comprised of Myc and the 26-29 kD factor) and
demonstrated the same flank preferences as the C1 complex. Several other
selected sites (9/36) contained a CAGCTG core sequence, presumably
selected by the C2' complex. In addition, 8 of the 36 sequences were very
AT rich, and many of the sequences in all three groups contained AT rich
stretches. This enrichment for AT rich sequences might reflect a
preference of Myc for these sequences, or instead might simply indicate a
bias arising from the protocol used. It is interesting to note, however, that
in previous filter binding experiments, the mammalian Myc preparation has
demonstrated a preference for binding AT rich sequences within various
plasmids or lambda genomic DNA.

To confirm the validity of our site selection procedure, a number of
the selected sites were tested individually by EMSA (Fig. 8). As expected,
it was found that sequences containing the core CACGTG formed both the
C1 and C2 complexes (Fig. 8, probe groups 1, 2, 5, and 6) while
sequences containing the CAGCTG core formed only the C2' complex
(Fig. 8, probe groups 7 and 8). Note that the C2' activity is only present
in the CHO-derived Myc preparations. No complex formed when selected
sequences that did not contain a canonical E box core were tested (Fig. 8,
probe groups 3, 4, 9, 10, and 11). These latter sequences, therefore, do
not represent high affinity sites for proteins in the Myc preparations.

Example 6 

Off Rate Of The C1 And C2 Complexes

Off-rates for the Myc-containing complexes were measured as a
means of comparing their affinities. The off-rate of the C2 complex
formed on the USE probe was approximately 1-2 minutes (Fig. 9,
baculovirus Myc; similar results were obtained with CHO Myc, data not
shown). The C1 complex was not fully competed in this experiment using
a 250-fold excess of USE competitor. Although competition was not
complete, the amount of C1 complex remaining at the earliest measurable
timepoint ("0") was significantly less than the starting amount and virtually
equal to the maximum competition achieved in these experiments. These
data are indicative of an abundant weakly binding protein with an
immeasurably fast off-rate. Therefore Myc alone appears to bind
significantly more weakly than does Myc together with the 26-29 kD factor.
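
For orientation, a time of 1-2 minutes can be converted into a rate
constant under two assumptions that are not stated in the text: that the
reported value is a half-life, and that dissociation in the presence of
excess competitor is first order. The function names below are
hypothetical, and the sketch is only the standard arithmetic for that
model.

```python
import math


def koff_from_half_life(t_half_min):
    """First-order dissociation rate constant (per minute) from a half-life."""
    return math.log(2) / t_half_min


def fraction_bound(t_min, t_half_min):
    """Fraction of complex remaining t_min minutes after adding competitor."""
    return math.exp(-koff_from_half_life(t_half_min) * t_min)


if __name__ == "__main__":
    # Bracket the reported 1-2 minute range (assumed here to be a half-life).
    for t_half in (1.0, 2.0):
        print(t_half, koff_from_half_life(t_half), fraction_bound(5.0, t_half))
```

Under these assumptions, a complex whose half-life is far shorter than
the earliest measurable timepoint would already be at its competed
plateau at time "0", which is the behavior described for C1.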
Example 7

Identification of an Inhibitor of c-Myc
C2 Complex Activity in Yeast Cells

Yeast host cells are transformed with plasmids carrying a c-Myc
expression vector (host 'a'); or the c-Myc expression vector and a 26-29
kilodalton C2 complex protein identified as above (host 'b'). In addition,
all yeast strains are cotransformed with a plasmid that contains the coding
sequence for β-galactosidase operably linked to the CACGTG sequence
motif as described above.

A lawn of each of the transformed yeast strains is spread on agar
plates containing X-gal in the medium, and small filter disks containing
compound W, X, Y, or Z are placed on the lawns. The yeast are allowed
to grow and the plates are monitored for colony growth and colony color
by visual observation. Typical results from such an experiment are shown
in Table 1.

Table 1: Identification of Inhibitors of C2 Complex Activity

Compound   Yeast   Growth   Colony Color from
                            β-gal Assay with X-gal
--------   -----   ------   ----------------------
none       a       +        White
           b       +        Blue
W          a       +        White
           b       +        White
X          a       -
           b       -
Y          a       +        White
           b       +        Blue
Z          a       +        Blue
           b       +        Blue


The results of the above table indicate that compound W prevents
the induction of β-galactosidase in the 'b' host cells. Therefore,
compound W is an inhibitor of C2 complex hetero-oligomer formation and
an inhibitor of c-Myc biological activity. Compound X inhibits the
growth of yeast per se and thus would not be a compound of interest.

Compound Y does not prevent induction of β-galactosidase activity
in the 'b' host cells. Therefore, compound Y is not an inhibitor of C2
complex hetero-oligomer formation.

Compound Z shows an interesting effect of inducing β-galactosidase
activity in the 'a' host cells, which do not contain the C2 complex protein
used in the 'b' hosts, rather than preventing hetero-oligomer formation.
This suggests that compound Z may induce synthesis of a partner protein
which is not otherwise present in the yeast host cells or that it may be (or
mimic) such a protein.

From these results, compound W would be identified as an inhibitor
of C2 complex formation and/or DNA binding and thus of c-Myc
transcriptional activity in vivo.
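
The reasoning applied to Table 1 can be written out as a small decision
rule. This is an illustrative sketch only: the function name, the
category labels, and the encoding of the plate readout as booleans and
color strings are assumptions introduced here, and the data simply
restate Table 1 with compound X recorded as showing no growth.

```python
# Readout per compound: (growth_a, color_a, growth_b, color_b).
# Colors are None where no colonies grew.
TABLE_1 = {
    "none": (True, "white", True, "blue"),
    "W": (True, "white", True, "white"),
    "X": (False, None, False, None),
    "Y": (True, "white", True, "blue"),
    "Z": (True, "blue", True, "blue"),
}


def classify(growth_a, color_a, growth_b, color_b):
    """Interpret one compound's readout on host 'a' and host 'b' lawns."""
    if not (growth_a and growth_b):
        # The compound is toxic to yeast, so the reporter is uninformative.
        return "growth inhibitor"
    if color_a == "blue":
        # Reporter is on even without the partner protein of host 'b'.
        return "induces reporter without partner"
    if color_b == "white":
        # Reporter is off in the host that normally turns blue.
        return "candidate inhibitor"
    return "no inhibition"


if __name__ == "__main__":
    for compound, readout in TABLE_1.items():
        print(compound, classify(*readout))
```

The growth check comes first because a white or absent color on a plate
with no colonies cannot be distinguished from inhibition of the complex.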

Example 8 

Identification of an Inhibitor of c-Myc
C2' Complex Activity in Yeast Cells



Yeast host cells are transformed with two plasmids, each plasmid
carrying a C2' complex expression vector encoding at least one 40-50
kilodalton C2' peptide (host 'a'); or the c-Myc expression vector in
addition to the vectors encoding the C2' complex proteins identified as
above (host 'b'). In addition, all yeast strains are cotransformed with a
plasmid that contains the coding sequence for β-galactosidase operably
linked to the CAGCTG sequence motif as described above.

A lawn of each of the transformed yeast strains is spread on agar
plates containing X-gal in the medium, and small filter disks containing
compound W, X, Y, or Z are placed on the lawns. The yeast are allowed
to grow and the plates are monitored for colony growth and colony color
by visual observation. Typical results from such an experiment are shown
in Table 2.

Table 2: Identification of Inhibitors of C2' Complex Activity

Compound   Yeast   Growth   Colony Color from
                            β-gal Assay with X-gal
--------   -----   ------   ----------------------
none       a       +        White
           b       +        Blue
W          a       +        White
           b       +        White
X          a       -
           b       -
Y          a       +        White
           b       +        Blue
Z          a       +        Blue
           b       +        Blue


The results of the above table indicate that compound W prevents
the induction of β-galactosidase in the 'b' host cells. Therefore, compound
W is an inhibitor of C2' complex hetero-oligomer formation and an
inhibitor of the c-Myc biological activity that is directed towards promoting
such C2' complex hetero-oligomer formation. Compound X inhibits the
growth of yeast per se and thus would not be a compound of interest.

Compound Y does not prevent induction of β-galactosidase activity
in the 'b' host cells. Therefore, compound Y is not an inhibitor of C2'
complex hetero-oligomer formation.

Compound Z shows an interesting effect of inducing β-galactosidase
activity in the 'a' host cells, which do not contain the Myc protein used
in the 'b' hosts, rather than preventing hetero-oligomer formation. This
suggests that compound Z may induce synthesis of a protein, not otherwise
present in the yeast host cells, that can substitute for Myc in promoting
formation of the C2' complex, or that it may be (or mimic) such a
protein.

From these results, compound W would be identified as an inhibitor
of C2' complex formation and/or DNA binding activity and thus of c-Myc
transcriptional activity in vivo.

All references cited herein are fully incorporated by reference.

Having now fully described the invention, it will be understood by those
with skill in the art that the invention may be performed within a wide and
equivalent range of conditions, parameters and the like, without affecting
the spirit or scope of the invention or any embodiment thereof.

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: Kingston, Robert E
Papoulas, Ophelia

(ii) TITLE OF INVENTION: C-MYC DNA BINDING PARTNERS,
MOTIFS, SCREENING ASSAYS, AND USES THEREOF

(iii) NUMBER OF SEQUENCES: 101

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Sterne, Kessler, Goldstein & Fox
(B) STREET: 1225 Connecticut Avenue, N.W., Suite 300
(C) CITY: Washington
(D) STATE: DC
(E) COUNTRY: USA
(F) ZIP: 20036

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER: US
(B) FILING DATE:
(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Cimbala, Michele A
(B) REGISTRATION NUMBER: 33,851
(C) REFERENCE/DOCKET NUMBER: 0609.3440004

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (202)833-7533
(B) TELEFAX: (202)833-8716



(2) INFORMATION FOR SEQ ID NO:1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
GACCACGTGC TC

(2) INFORMATION FOR SEQ ID NO:2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
GACCACGTGG TC 12

(2) INFORMATION FOR SEQ ID NO:3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
AACAGTYCTG TT

(2) INFORMATION FOR SEQ ID NO:4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
CCGATATCTC GAGACGG 17

(2) INFORMATION FOR SEQ ID NO:5:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 16 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
GGACGATCTA GATTCG 16

(2) INFORMATION FOR SEQ ID NO:6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
GATCTCTCCA CCACCTGCCA CCACCTCGCA CCAGCTCCCG 40

(2) INFORMATION FOR SEQ ID NO:7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
GATCTGCAGT CATGTGGCGT CATGTGGCGT CATGTGGCAG 40

(2) INFORMATION FOR SEQ ID NO:8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
GATCTGCAGT CACGTGGCGT CACGTGGCGT CACGTGGCAG 40

(2) INFORMATION FOR SEQ ID NO:9:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
TOGAOGTOGC AGCAGOTGCA G 21

(2) INFORMATION FOR SEQ ID NO:10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
TCCACCCCAC CAGCTGCCGA G

(2) INFORMATION FOR SEQ ID NO:11:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:
AQCTTCGAAC ACCTGCAGCA GCTGOCAGGA AGCAGGCCTA

(2) INFORMATION FOR SEQ ID NO:12:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
ACCTTTAAAA TCCCCACCAG CTGGCGAAGC AACACGTCCA

(2) INFORMATION FOR SEQ ID NO:13:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 39 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:
AATTCCGAAA COCCTGOAAT ATTOOGAOCT GGCAGCCTC

(2) INFORMATION FOR SEQ ID NO:14:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
TCCACTTTAC ACCACCTCCT CCCCTCGA

(2) INFORMATION FOR SEQ ID NO:15:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 52 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:
GGAOGATCTA GATTOGNNNN NNNNNNNNNN NNNNNNCOGT CTOCAGTATC GG 52

(2) INFORMATION FOR SEQ ID NO:16:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:
NNCANNTGNN 10

(2) INFORMATION FOR SEQ ID NO:17:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:
CCAGAATCTA OCAOOTGCTC C 21

(2) INFORMATION FOR SEQ ID NO:18:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:
CGCCCTACCA OCTCCTTATC 20

(2) INFORMATION FOR SEQ ID NO:19:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:
GGACGAAAGC ACGTCCTCOG 20

(2) INFORMATION FOR SEQ ID NO:20:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:
GCACATGACC ACGTGCTCTG 20

(2) INFORMATION FOR SEQ ID NO:21:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:
GGCACAGACA CGTGCCCTGG 20

(2) INFORMATION FOR SEQ ID NO:22:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:
GGGAAACGAC CTCTTATCTG 20

(2) INFORMATION FOR SEQ ID NO:23:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:
OGACGAOGTG CTCTTCGACT TG 22

(2) INFORMATION FOR SEQ ID NO:24:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:
GCACAATTTG TACCACGTGG CCG 23

(2) INFORMATION FOR SEQ ID NO:25:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:
GGACAACATC GACGACGTGG CCG 23

(2) INFORMATION FOR SEQ ID NO:26:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:
GCCTGCATGA CCACGTGGAC C

(2) INFORMATION FOR SEQ ID NO:27:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:
GCAAATATGA CCACGTGGTA C

(2) INFORMATION FOR SEQ ID NO:28:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:
GGACCACCTC CTCTTTTCTG

(2) INFORMATION FOR SEQ ID NO:29:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:
GGCATAAACT CGAOCTGGTC C 21

(2) INFORMATION FOR SEQ ID NO:30:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30:
OGGGCACGTG CTCCTCGGAC TC 22

(2) INFORMATION FOR SEQ ID NO:31:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:
CCTAGCAAAA AGCACGTCCC CG 22

(2) INFORMATION FOR SEQ ID NO:32:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32:
GGCCCATTTA ACCACGTGCT CC 22



(2) INFORMATION FOR SEQ ID NO:33:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33:
GAGCTATTAA CCAOGTGGTA C 21

(2) INFORMATION FOR SEQ ID NO:34:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:34:
GACCACCCGG CATCCACGTG COCT 24

(2) INFORMATION FOR SEQ ID NO:35:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:35:
GCGGACGACG TGCTCGGTTG 20

(2) INFORMATION FOR SEQ ID NO:36:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:36:
CACATATTAG ACCACCTGCT CC 22

(2) INFORMATION FOR SEQ ID NO: 37: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

CGGOCAOGTG CTCACTGTCT ACC 23 

(2) INFORMATION FOR SEQ ID NO:38: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:
GGATGGAGAG CTTCTTCCTG 20



(2) INFORMATION FOR SEQ ID NO: 39:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:
GCAATCCCCC CCTGCTCGCC 20



(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:
GCCAAAAATG TACAGCTGTC CC 22 
(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:

CGGOCAOGAG CTCATCAATG TGC 23 

(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:
GCAGGCTCTA CCTCACTTCC 20 
(2) INFORMATION FOR SEQ ID NO: 43:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:
COGCAGTCCT GGTGCTCTGC 20 
(2) INFORMATION FOR SEQ ID NO: 44:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:
CACTAAGAAA TACCACGTGG CCG 23 
(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:45: 

GGGGATTTAA CCACCTGCTC C 21 

(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:
OGCCCAOGTG CCTTCTTTCT CCC 23 
(2) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 
CATAGTOGAG AGAGCACCTG OGAA 24 
(2) INFORMATION FOR SEQ ID NO: 48:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 
GATAAGTGAG ACCACGTGCC CG 22 
(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:

CCCAACTAAG ACCACGTGCC CG 22 

(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 
CACTCCAAGA GGCCACGTCG CCA 23 
(2) INFORMATION FOR SEQ ID NO: 51:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:
OGTAGGTTAT TCCCACGTGC COG 23
(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:
CATAAATAGG CCACGTGCTC C 21 
(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 

GGAAAATGTA CCACGTGCTC C 21 

(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 
GGAACAGACC ACCTCCCTTC 20 
(2) INFORMATION FOR SEQ ID NO: 55:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:
GTACCAOCTG CTTTTTTGCC 20
(2) INFORMATION FOR SEQ ID NO: 56:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:
CAGTCCGAGG AGCACGTGCC CG 22
(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

CGGCCAOGTG TOGAGCATCA GTC 23 

(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:
CGGCCACGTG CTCCTAAATT TCC 23 

(2) INFORMATION FOR SEQ ID NO: 59:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:
GCGACAAAAT TACCACCTCC COG 23 
(2) INFORMATION FOR SEQ ID NO:60: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 
CGCAAAATCG ACCACGTGGT CC 22 
(2) INFORMATION FOR SEQ ID NO: 61:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:
GCATAAGTAA TACCAOGTCG CCC 23 
(2) INFORMATION FOR SEQ ID NO: 62: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:

CAOGTGCTCC 20 
(2) INFORMATION FOR SEQ ID NO: 63:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:
GCGGCCGGAA CTCOCTTGTC 20 



(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:
CGGGACCCCA TCTCTCGCTC 20 



(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:
CAATAATATT TCCTTTOCTC 20 
(2) INFORMATION FOR SEQ ID NO: 66: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:
GTOCAOGOGG CATCCACGTG COST 24
(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:
OGGCCACGTC CTCTATAGAT GCC 23
(2) INFORMATION FOR SEQ ID NO: 68:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:
GGACCACGTG CTTATCTTTG 20
(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:
CGACCAOGTG TTCCGCTACT CG 22
(2) INFORMATION FOR SEQ ID NO: 70: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:
CGAGTAGCGA GCACGTGTTG C 21
(2) INFORMATION FOR SEQ ID NO: 71:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:
GCACCACOTG CTTACCATGT C 21
(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:
GGACAAAAAG CACGTGCTAC 20
(2) INFORMATION FOR SEQ ID NO: 73:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:
GCAAAACTCC ACGTGGTOGG 20
(2) INFORMATION FOR SEQ ID NO: 74: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74:
GGGCAAAAAC AACAGCTCTC OG 22

(2) INFORMATION FOR SEQ ID NO: 75:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75:
GGGAAAGAGA TCAGCTOTGC G 21
(2) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76: 
GGAGAATTGA ACAGCTGACC C 21 
(2) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:
GGGACAAACC AGTCAGCTGG CCG 23 
(2) INFORMATION FOR SEQ ID NO: 78: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78:

GGGCACAGCT CTTTACTGGG 20 

(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:
CGCAAGCGCA CAGCTGTTCC 20
(2) INFORMATION FOR SEQ ID NO: 80:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80:
GGCATTGATC AGCTCTGTGG 20
(2) INFORMATION FOR SEQ ID NO: 81: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81:
GGAAAAACGA GCTCGTCCCC 20
(2) INFORMATION FOR SEQ ID NO: 82: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82: 
CGCAAGTGTA ACAGCTGGTG C 21 
(2) INFORMATION FOR SEQ ID NO: 83:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:
CGATGGTTTT TTTTTTGTAC 20
(2) INFORMATION FOR SEQ ID NO: 84: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84:
GCATCATTTT CTTTTTCTCC 20
(2) INFORMATION FOR SEQ ID NO: 85: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:
CACA O TIT II TTGAGCCCCC 20 
(2) INFORMATION FOR SEQ ID NO: 86: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:
GCAAAAAATA AAAATACATC 20 
(2) INFORMATION FOR SEQ ID NO: 87:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:
GGCAAAAAAC TCAAAATACG 20



(2) INFORMATION FOR SEQ ID NO: 88: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88:
CCACAATAAA AAACTTTGCG 20

(2) INFORMATION FOR SEQ ID NO: 89: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89:
CCATATGTTC ATTCTTGTCC 20
(2) INFORMATION FOR SEQ ID NO: 90: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90:
CACAAAAATT TAGTCTGTGC 20 
(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91: 
CGCCCCCGTG CTCTAGCCCA TCC 23 
(2) INFORMATION FOR SEQ ID NO: 92: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92:
CGGGGAAGTC CCAAGTGCCC C 21 
(2) INFORMATION FOR SEQ ID NO: 93: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93: 
CACAGGAACA TACACGGGCC CG 22 
(2) INFORMATION FOR SEQ ID NO: 94: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:
GGGAOGGGAT GATTCACGTG CCCT 24

(2) INFORMATION FOR SEQ ID NO: 95: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95:
CGCAAGCGAC GTGAGTCCTG 20 
(2) INFORMATION FOR SEQ ID NO: 96:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96: 
CACCTACCAC TGATCCCGGC 20 
(2) INFORMATION FOR SEQ ID NO: 97: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97: 
GGACAAACAT CCGATTACCC 20 
(2) INFORMATION FOR SEQ ID NO: 98: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98:
GGGGATGGAA CATCGOCCTG 20
(2) INFORMATION FOR SEQ ID NO: 99:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99:
CGAGTCGGGC CTAACOGCCC 20
(2) INFORMATION FOR SEQ ID NO: 100:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100:
GGGAGCCATC GACGCCGGTG 20
(2) INFORMATION FOR SEQ ID NO: 101: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101:
CCATAGGGCA GTTGACAGCC 20
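
The length stated under (A) for each entry, and the presence of the hexamer motifs recited in the claims, can be cross-checked mechanically. The following is a minimal sketch in Python and is not part of the original disclosure: the helper names (`clean`, `reverse_complement`, `has_motif`, `check_entry`) are hypothetical, and the three entries shown are SEQ ID NO: 32, 48 and 76 as transcribed in the listing above. It strips the grouping spaces, compares the base count with the stated length, and searches both strands for a motif.

```python
# Illustrative sketch only; all helper names are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(seq: str) -> str:
    """Remove the grouping spaces used in the listing and upper-case the bases."""
    return "".join(seq.split()).upper()


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return clean(seq).translate(COMPLEMENT)[::-1]


def has_motif(seq: str, motif: str) -> bool:
    """True if the motif occurs on the given strand or on its reverse complement."""
    return motif in clean(seq) or motif in reverse_complement(seq)


def check_entry(seq: str, stated_length: int) -> bool:
    """True if the number of bases matches the length stated under (A)."""
    return len(clean(seq)) == stated_length


# SEQ ID NO -> (sequence as printed in the listing, stated length)
ENTRIES = {
    32: ("GGCCCATTTA ACCACGTGCT CC", 22),
    48: ("GATAAGTGAG ACCACGTGCC CG", 22),
    76: ("GGAGAATTGA ACAGCTGACC C", 21),
}

if __name__ == "__main__":
    for seq_id, (seq, length) in ENTRIES.items():
        print(
            seq_id,
            "length ok:", check_entry(seq, length),
            "CACGTG:", has_motif(seq, "CACGTG"),
            "CAGCTG:", has_motif(seq, "CAGCTG"),
        )
```

Both 5'-CACGTG-3' and 5'-CAGCTG-3' are their own reverse complements, so the second-strand search is redundant for them; it matters for a non-palindromic motif such as 5'-CATGTG-3', whose reverse complement is 5'-CACATG-3'.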
WHAT IS CLAIMED IS:

1. A method for the purification of Myc from a mammalian 
source, wherein said method comprises: 

(a) growing mammalian cells capable of expressing c-Myc; 

(b) inducing c-Myc expression in said cells; 

(c) lysing the membrane of said mammalian cells and purifying 
nuclei therefrom; 

(d) treating said nuclei in a buffer comprising DNase I; 

(e) solubilizing said nuclei in a buffer comprising sodium
dodecyl sulfate or urea at concentrations greater than 4 M
and separating the nuclear pellet from the supernatant
fraction;

(f) applying said supernatant fraction of step (e) to a DEAE
Sepharose CL-6B column and eluting bound c-Myc from said
DEAE Sepharose CL-6B column with a salt gradient;

(g) applying said c-Myc of step (f) to a FPLC Mono-Q column 
and eluting bound c-Myc with a salt gradient. 

2. A method for the detection of C1 complexes in a sample,
wherein said method comprises detecting DNA binding of c-Myc-
containing homo-oligomers to the DNA motif 5'-CACGTG-3', in its double
stranded DNA form.

3. A method for the detection of C2 complexes in a sample,
wherein said method comprises detecting DNA binding of c-Myc-
containing hetero-oligomers to the DNA motif 5'-CACGTG-3', in its
double stranded DNA form.

4. A method for the detection of C2' complexes in a sample,
wherein said method comprises detecting c-Myc directed DNA binding to
the DNA motif 5'-GAGCTG-3', in its double stranded DNA form.

5. A protein composition comprising at least one peptide
capable of forming a C2 complex, wherein said peptide capable of forming
a C2 complex is found in a 26-29 kD protein fraction purified from
Chinese hamster ovary cells or baculovirus.

6. The protein composition of claim 5, wherein said protein
composition is prepared from Chinese hamster ovary cells by a method
comprising the steps of:

(a) growing said cells;

(b) lysing the membrane of said cells and purifying nuclei
therefrom;

(c) treating said nuclei in a buffer comprising DNase I;

(d) solubilizing said nuclei in a buffer comprising sodium
dodecyl sulfate or urea at concentrations greater than 4 M
and separating the nuclear pellet from the supernatant
fraction;

(e) applying said supernatant fraction of step (e) to a DEAE
Sepharose CL-6B column and eluting the bound C2 complex protein
from said DEAE Sepharose CL-6B column with a salt
gradient; and

(g) applying the eluted C2 complex protein of step (f) to a FPLC
Mono-Q column and eluting bound C2 complex protein with
a salt gradient.

7. The protein composition of claim 5, wherein said protein
composition is prepared from baculovirus.

8. A protein composition comprising at least one peptide
capable of forming a C2' complex in the presence of c-Myc, wherein said
peptide capable of forming a C2' complex in the presence of c-Myc is
found in a 40-50 kD protein fraction purified from CHO cells.

9. The protein composition of claim 8, wherein said protein 
composition is prepared from Chinese hamster ovary cells by a method 
comprising the steps of: 

(a) growing said cells; 

(b) lysing the membrane of said cells and purifying nuclei 
therefrom; 

(c) treating said nuclei in a buffer comprising DNase I; 

(d) solubilizing said nuclei in a buffer comprising sodium 
dodecyl sulfate or urea at concentrations greater than 4 M 
and separating the nuclear pellet from the supernatant 
fraction; 

(e) applying said supernatant fraction of step (e) to a DEAE
Sepharose CL-6B column and eluting the bound C2' complex
protein from said DEAE Sepharose CL-6B column with a
salt gradient; and

(g) applying the eluted C2' complex protein of step (f) to a
FPLC Mono-Q column and eluting bound C2' complex
protein with a salt gradient.

10. A method for objectively classifying compounds, including 
human pharmaceuticals, as inhibitors of c-Myc activity, wherein said 
method comprises detecting the ability of said compound to inhibit C1
complex formation, C2 complex formation or C2' complex formation.

11. The method of claim 10, wherein said complex formation is
C1 complex formation.

12. The method of claim 10, wherein said complex formation is
C2 complex formation.

13. The method of claim 10, wherein said complex formation is
C2' complex formation.

14. A method for objectively classifying compounds, including
human pharmaceuticals, as inhibitors of c-Myc activity, wherein said
method comprises detecting the ability of said compound to inhibit C1
complex DNA binding, C2 complex DNA binding, or C2' complex DNA
binding.

15. The method of claim 14, wherein said DNA binding is C1
complex DNA binding.

16. The method of claim 15, wherein said DNA binding is to an
oligonucleotide comprising the sequence 5'-CACGTG-3'.

17. The method of claim 15, wherein said DNA binding is to an
oligonucleotide comprising the sequence 5'-CATGTG-3'.

18. The method of claim 14, wherein said DNA binding is C2 
complex DNA binding. 

19. The method of claim 18, wherein said DNA binding is to an
oligonucleotide comprising the sequence 5'-CACGTG-3'.

20. The method of claim 18, wherein said DNA binding is to an
oligonucleotide comprising the sequence 5'-CATGTG-3'.

21. The method of claim 14, wherein said DNA binding is C2'
complex DNA binding.

22. The method of claim 21, wherein said DNA binding is to an
oligonucleotide comprising the sequence 5'-CAGCTG-3'.

23. A method for the purification of a peptide capable of forming
a C2 or C2' complex, or a mixture of such peptides from a crude preparation,
wherein said method comprises extraction of Chinese hamster ovary
cells and assay of said peptide by detection of the ability of said peptide to
form said C2 or said C2' complex.

24. A method for identifying and classifying a compound as an
inhibitor of c-Myc hetero-oligomer DNA binding wherein said method
comprises evaluating the ability of said compound to alter expression of a
reporter gene in a host cell, wherein expression of said reporter gene is
operably-linked to DNA binding by said hetero-oligomer to an
oligonucleotide comprising the sequence 5'-CACGTG-3'.

25. The method of claim 24, wherein expression of said reporter 
gene induces a phenotypic change in a host cell. 

26. The method of claim 24, wherein said reporter gene is lacZ.

27. The method of claim 24, wherein said reporter gene is CAT.

28. The method of claim 24, wherein said reporter gene is
LEU2.

29. The method of claim 24, wherein said phenotypic change is 
detected by visual inspection of the host cell. 




30. The method of claim 24, wherein said host is S. cerevisiae. 

31. The method of claim 24, wherein said host is a mammalian
cell.



32. A method for identifying and classifying a compound as an
inhibitor of c-Myc-directed C2' hetero-oligomer DNA binding wherein said
method comprises evaluating the ability of said compound to alter
expression of a reporter gene in a host cell, wherein expression of said
reporter gene is operably-linked to DNA binding by said hetero-oligomer to
an oligonucleotide comprising the sequence 5'-CAGCTG-3'.

33. The method of claim 32, wherein expression of said reporter 
gene induces a phenotypic change in a host cell. 

34. The method of claim 32, wherein said reporter gene is lacZ.

35. The method of claim 32, wherein said reporter gene is CAT.

36. The method of claim 32, wherein said reporter gene is
LEU2.



37. The method of claim 32, wherein said phenotypic change is 
detected by visual inspection of the host cell. 

38. The method of claim 32, wherein said host is S. cerevisiae. 

39. The method of claim 32, wherein said host is a mammalian 
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